(57) Abstract: A method for IP routing lookup to determine where to forward an
IP-datagram with a given destination address by retrieving from a routing table a
next-hop index indicating where to forward said datagram, said next-hop index being
associated with the longest matching prefix of said destination address, said address
being a number in an address universe U, wherein a set of address prefixes P and a
mapping of P onto a set of next-hop indices D are converted into a set of ranges R,
constituting a partition of U, and a mapping of R onto D. The method involves the
steps of building and storing in a memory a forwarding table representation from R
and D by using a predetermined layered data structure where the construction of the
layer is selected depending on the range density |R'| for the sub-universe U'
represented by that layer to get a space efficient representation of the set of ranges
R, and performing the lookup by a range matching operation in said forwarding table.
A corresponding system comprises a first converting means for converting a set of
address prefixes P into a set of ranges R constituting a partition of said universe U
and a second converting means for converting the mapping from P onto a set of
next-hop indices D to an equivalent mapping from R onto D. The system also comprises
data structuring means for forming predetermined layered data structures T
representing the routing table, and building and memory means for building and
storing a forwarding table representation from R and D by using a predetermined
layered data structure where the construction of the layer is selected depending on
the range density |R'| for the sub-universe U' represented by that layer to get a
space efficient representation of the set of ranges R, and means for performing the
lookup by a range matching operation in said forwarding table.
Method and system for fast IP routing lookup
using forwarding tables with guaranteed 
compression rate and lookup performance 

1 Technical Field 

The present invention relates to a method and a system for fast IP routing lookup, 
in a compressed forwarding table, to determine where to forward an IP-datagram 
with a given destination address by retrieving from a routing table a next-hop index 
indicating where to forward said datagram, said next-hop index being associated with 
the longest matching prefix of said destination address, said address being a number 
in an address universe U, wherein a set of address prefixes P and a mapping of P 
onto a set of next-hop indices D are converted into a set of ranges R, constituting a
partition of U, and a mapping of R onto D. The system according to the invention
comprises a first converting means for converting said set of address prefixes P into 
a set of ranges R constituting a partition of said universe U and a second converting 
means for converting the mapping from P onto D to an equivalent mapping from R 
onto D. 

2 Background 

Internet is formed of a plurality of networks connected to each other, wherein each 
of the constituent networks maintains its identity. Each network supports commu- 
nication among devices connected to the networks, and the networks in their turn 
are connected by routers. Thus, Internet can be considered to comprise a mass of 
routers interconnected by links. Communication among nodes (routers) on Internet 
takes place using an Internet protocol, commonly known as IP. IP datagrams (packets)
are transmitted over links from one router to the next one on their way towards
the final destinations. In each router a forwarding decision is performed on incoming
datagrams to determine the datagram's next-hop router.

A routing or forwarding decision is normally performed by a lookup procedure in 
a routing table. Thus, IP routers do a routing lookup in the routing table to obtain 
next-hop information about where to forward the IP datagrams on their path toward 
their destinations. A routing lookup operation on an incoming datagram requires the 
router to find the most specific path for the datagram. This means that the router 
has to solve the so-called longest prefix matching problem which is the problem of 
finding the next-hop information (or index) associated with the longest address prefix 
matching the incoming datagram's destination address in the set of arbitrary length
(i.e. between 0 and 32 bits) prefixes constituting the routing table. 

To speed up the forwarding decisions, many IP router designs of today use a 
caching technique wherein the most recently or most frequently looked up destination 
addresses and the corresponding routing lookup results are kept in a route cache. This 
method works quite well for routers near the edges of the network, i.e. so called small
office and home office (SOHO) routers, that have small routing tables, low traffic 
loads, and high locality of accesses in the routing table. Another method of speeding 
up the routers is to exploit the fact that the frequency of routing table updates, 
resulting from topology changes in the network etc., is extremely low compared to the 
frequency of routing lookups. This makes it feasible to store the relevant information 
from the routing table in a more efficient forwarding table optimized for supporting
fast lookups. When changes to the routing table occur, the forwarding table is
partially or completely rebuilt. 

In P. Gupta, Algorithms for Routing Lookups and Packet Classification, A Dissertation
Submitted to the Department of Computer Science and the Committee on Graduate
Studies of Stanford University, December 2000, a forwarding table representation and
a corresponding lookup procedure using merely Static Vector Nodes, i.e. direct
addressing, in two steps (i.e. 2 memory accesses) is described. With a continuing
increase of the use of Internet there is a constant need of improving the IP routing
lookup. In WO 01/22667 a technique is described for making direct indexing more
efficient by compressing vector nodes having few used buckets. In the described
system a forwarding table consisting of three levels of compressed pointer arrays is
used and the cost for a lookup is 6 memory accesses in the worst case. In case of
sparsely populated intervals the waste of memory space is considerable when using
both these methods, even though the second is much better. The so called Lulea
Algorithm is another forwarding table, described in WO 99/14906, where the memory
space utilization is heavily improved but the cost for lookup is increased to 12 memory
accesses.

The purpose of the present invention is to render the IP routing still more efficient 
by introducing a new forwarding table representation that improves the memory space 
utilization even further while reducing the cost for a lookup to 4 memory accesses. 

3 Disclosure of the Invention 

This purpose is obtained by a method and a system of the kind defined in the intro- 
ductory portion of the description and having the characterizing features of Claims 1 
and 18 respectively. In the present invention a series of data structures is consequently 
developed for representing the forwarding table which data structures are adapted to 
the interval density of the subuniverse in question. Thus an optimum structure is 
selected for a certain interval density in order to minimize the memory needs. For 
each memory access a certain block size is introduced and starting from this size the 
data structure is adapted such that the memory access will be as efficient as possible. 
There is no problem to reduce the number of memory accesses for lookup if there is 
no limit for the available memory and, on the other hand, there is no problem to per- 
form a compression of the table if the number of required memory accesses for lookup 
can be disregarded. With the present invention both these quantities are minimized. 
According to another aspect of the present invention a computer program product 
is provided having computer program code means to make a computer execute the 
above method when the program is run on a computer. 

4 Brief Description of the Drawings 

To further explain the invention, embodiments chosen as examples will now be
described in greater detail with reference to the drawings, in which Figure 1 is a
flowchart showing the hierarchy of the different sub-data structures and the
corresponding lookup procedures according to the invention, Figure 2 illustrates the
next-hop index and sub-tree pointer encoding scheme used in first step static vector
nodes, Figure 3 illustrates a Dynamic Flat Tree sub-data structure with 25 ranges
showing the layout of the information with respect to cache line boundaries, Figure 4
illustrates a Dynamic Layered Tree leaf and the encoding of range boundaries and next
hop indices as well as the corresponding node, Figure 5 further illustrates the layout
of information in a binary Dynamic Layered Tree data structure, Figure 6 shows the
layout of information in a Static Flat Tree data structure, Figure 7 is a schematic
block diagram of an embodiment of the system according to the invention, and Figure 8
shows an example of a small forwarding table represented by a 32 bits variation of the
Dynamic Layered Tree data structure.
5 Description of Preferred Embodiments

In the following the development of a series of data structures used to represent com- 
pressed forwarding tables supporting fast routing lookups is described. In particular, 
we develop numerous techniques that are combined to compress $2^{18}$ routes into a
forwarding table data structure using less than 2.7 Mbytes of memory and still not 
requiring more than 4 memory accesses for lookup. 

5.1 Preliminaries 

5.1.1 Some Notes on Notation

In the following we will express ranges or intervals by using the dot notation. For
numbers $a$ and $b$, $a < b$, we will denote ranges or sequences (depending on the
context) involving $a$ and $b$ as follows

$$b-1 \ldots b = b-1,\ b$$

$$a \ldots b = a,\ a+1,\ \ldots,\ b.$$

We will also use various entities represented by bit strings. Unless explicitly stated
otherwise, the bits are numbered starting from 0 and ending with the highest bit. Let
$x$ be a $k+1$ bits bit string. Bit $i$ is denoted by $x[i]$ and the whole bit string
can be written as

$$x = x[k]\,x[k-1] \ldots x[1]\,x[0] = x[k \ldots 0].$$

The reason for starting with the highest bit is that we often consider the bit strings
as non-negative integers where the bit string constitutes the binary representation.
The size of a bit string $x$ is denoted by $|x|$, i.e. for $x$ in the previous example
we have that $|x| = k+1$. We can append bit strings by simply placing them after each
other: $z = xy$ is the bit string obtained when appending $y$ to $x$. That is,
$z[|y|-1 \ldots 0] = y$ and $z[|y|+|x|-1 \ldots |y|] = x$. The string obtained by
repeatedly appending $k$ copies of $x$ is denoted by $x^k$, i.e.

$$x^k = \underbrace{xx \ldots x}_{k\ \text{times}}.$$

Finally, we will represent sets such as $X = \{x_1, x_2, \ldots, x_n\}$. The size of
$X$ is denoted by $|X|$ and in this case, $|X| = n$. In addition, we will use the
standard set relations such as $\in$, $\subset$, $\cap$, $\cup$, and $\subseteq$ in
the following ways

$$X \subseteq X$$

$$\{x_0, x_1, \ldots, x_n\} \cap \{x_1, x_2, \ldots, x_{n+1}\} = X$$

$$\{x_1, x_2, \ldots, x_i\} \cup \{x_{i+1}, x_{i+2}, \ldots, x_n\} = X.$$
5.1.2 Routing Tables and Routing Lookups

A routing table is a surjective mapping from a set of IP-address prefixes, or simply
address prefixes, to a set of next-hop indices. Routing tables are used to determine
where to forward an IP-datagram with a given destination address, as mentioned
above. This is achieved by finding the longest matching prefix of the destination
address and retrieving the associated next-hop index representing what to do with the
packet. The process of performing the longest prefix match and retrieving the
next-hop index is referred to as routing lookup.

An IPv4 address is a number in the universe

$$U = \{0, 1, \ldots, 2^{32}-1\}.$$

It is represented by a bit vector

$$a = a[31]\,a[30] \ldots a[1]\,a[0] = a[31 \ldots 0]$$

equivalent to the binary representation of the number. Address prefixes are also
represented by bit vectors. The prefix of length 0 is denoted by the wildcard symbol
* and matches any address. Any other prefix, of non-zero length $k$, is denoted by

$$p = \underbrace{p[31]\,p[30] \ldots p[32-k]}_{k\ \text{bits}}\,{*} = p[31 \ldots 32-k]\,{*}$$

and matches any address $a$ satisfying

$$a[31 \ldots 32-k] = p[31 \ldots 32-k].$$
Address prefixes also represent subsets of U consisting of ranges. That is, the prefix
$p[31 \ldots 32-k]\,{*}$ represents the range $p^{lo} \ldots p^{hi}$ where the binary
representations of $p^{lo}$ and $p^{hi}$ are given by

$$p[31 \ldots 32-k]\,\underbrace{00 \ldots 0}_{32-k\ \text{bits}}$$

and

$$p[31 \ldots 32-k]\,\underbrace{11 \ldots 1}_{32-k\ \text{bits}}$$

respectively. For any pair of prefixes $P_1$ and $P_2$ we have either
$P_1 \subset P_2$, $P_1 = P_2$, or $P_1 \supset P_2$. It is rather straightforward to
convert the set of prefixes to a set of $n$ ranges

$$R = \{R_1, R_2, \ldots, R_n\}$$

constituting a partition of U, where the first, $i$th, and last range are given by

$$R_1 = r_1^{lo} \ldots r_1^{hi} = 0 \ldots r_1^{hi},$$

$$R_i = r_i^{lo} \ldots r_i^{hi} = r_{i-1}^{hi}+1 \ldots r_i^{hi},$$

and

$$R_n = r_n^{lo} \ldots r_n^{hi} = r_{n-1}^{hi}+1 \ldots 2^{32}-1$$

respectively. Moreover, each range will be associated with a next-hop index in the
same way as the original prefixes, i.e. we have a surjective mapping from R to the set
of next-hop indices. Thus, we can perform routing lookup by finding the only matching
range of an address $a$, i.e. the $R_i$ satisfying $r_i^{lo} \le a \le r_i^{hi}$,
instead of performing the longest prefix match. The conversion can be performed as
follows. Initially, let R be empty and let the current address $a$ be 0. Then
repeatedly insert $R = r^{lo} \ldots r^{hi}$ into R, where $r^{lo} = a$ and $r^{hi}$ is
the largest address such that every address in $r^{lo} \ldots r^{hi}$ is associated
with the same next-hop index as $a$, and assign $r^{hi} + 1$ to $a$ as long as
$a \in U$.
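The conversion can be illustrated with a minimal Python sketch. It is not the procedure of the embodiment but a breakpoint-based variant that yields the same partition: the longest matching prefix can only change at the first address of a prefix or directly after its last address, so it suffices to evaluate the match at those points and merge neighbouring pieces with equal next-hop index. The tuple layouts and the function names `prefix_bounds`, `longest_prefix_match` and `prefixes_to_ranges` are assumptions made for the illustration, and a default route (prefix of length 0) is assumed to be present so that every address matches.

```python
from typing import List, Tuple

Prefix = Tuple[int, int, int]  # (32-bit value, prefix length k, next-hop index) - assumed layout
Range = Tuple[int, int, int]   # (lo, hi, next-hop index) - assumed layout

UNIVERSE_SIZE = 1 << 32


def prefix_bounds(value: int, length: int) -> Tuple[int, int]:
    """Return (p_lo, p_hi): the prefix bits followed by 32-k zeroes resp. ones."""
    span = 1 << (32 - length)
    lo = value & ~(span - 1) & 0xFFFFFFFF
    return lo, lo + span - 1


def longest_prefix_match(prefixes: List[Prefix], a: int) -> int:
    """Reference lookup by linear scan; assumes a default route exists."""
    best_len, best_hop = -1, None
    for value, length, hop in prefixes:
        lo, hi = prefix_bounds(value, length)
        if lo <= a <= hi and length > best_len:
            best_len, best_hop = length, hop
    return best_hop


def prefixes_to_ranges(prefixes: List[Prefix]) -> List[Range]:
    """Convert prefixes to a partition of U into ranges with next-hop indices."""
    starts = {0}
    for value, length, _ in prefixes:
        lo, hi = prefix_bounds(value, length)
        starts.add(lo)
        if hi + 1 < UNIVERSE_SIZE:
            starts.add(hi + 1)
    points = sorted(starts)
    ranges: List[Range] = []
    for i, lo in enumerate(points):
        hi = points[i + 1] - 1 if i + 1 < len(points) else UNIVERSE_SIZE - 1
        hop = longest_prefix_match(prefixes, lo)
        if ranges and ranges[-1][2] == hop:
            ranges[-1] = (ranges[-1][0], hi, hop)  # extend the previous range
        else:
            ranges.append((lo, hi, hop))
    return ranges


# Example: default route, 10.0.0.0/8 and 10.1.0.0/16
table = [(0, 0, 1), (0x0A000000, 8, 2), (0x0A010000, 16, 3)]
ranges = prefixes_to_ranges(table)
```

For the three prefixes of the example the sketch produces five ranges, which is within the bound of 2n + 1 ranges for n prefixes.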

We conclude this section with a summary of the steps performed so far and a more
formal definition of the problems yet to solve. Let $D = \{d_1, d_2, \ldots, d_m\}$ be
the set of next-hop indices (the letter D/d is chosen to represent the looked up data).
Moreover, let next-hop index $d_i$ be represented by a $\lceil \lg m \rceil$ bits
non-negative integer

$$\underbrace{d_i[\lceil \lg m \rceil - 1]\ d_i[\lceil \lg m \rceil - 2] \ldots d_i[1]\ d_i[0]}_{\lceil \lg m \rceil\ \text{bits}}.$$
Initially, we had a set of prefixes P and a mapping of P onto D. By the longest prefix
match operation we also have a mapping of U onto D. Finally, we have described 
a method for converting P and the mapping of P onto D to a set of ranges R, 
constituting a partition of U, and a mapping of R onto D. In effect, we have converted 
the original longest prefix match problem in P to the simpler but equivalent (only) 
range matching problem in R. 

The conversion described can be applied to any routing table representation to 
retrieve this partition of ranges (and corresponding next-hop indices) serving as an 
intermediate representation from which the forwarding table is built (see Figure 7 bp2 
and bp3). The core of the present invention: 

• the efficient forwarding table representation constructed from the intermediate 
range partition representation and 

• the lookup procedure used to retrieve the next-hop index associated with the only
matching range from said intermediate representation

can therefore be used in conjunction with any routing table implementation to accel- 
erate the routing lookups. 

5.1.3 Computational Model 

The worst case, from the point of view of costs, for performing a computation such as 
a routing lookup in a fairly large data structure is mainly dependent on the number 
of memory accesses performed. In each memory access, a cache line or block of k 
consecutive bytes is accessed from the main memory. The block is then stored at all 
levels of the cache hierarchy before the actual data is accessed in the fastest first level 
cache. Since the dominating cost for a memory access is the copying of the block from 
main memory, we count multiple accesses to the same k bytes block during a short 
computation such as a lookup, as one memory access.

5.1.4 Design Parameters 

In our target architecture, e.g. the Intel Pentium III, each cache line consists of 32 
bytes and this is the value of k we will use throughout the design of the algorithm.

It is estimated that the size of the largest routing tables will exceed 200000 routes
in the next few years. High-end routing table data structures and implementations of
today are therefore designed to accommodate $2^{18} = 262144$ routes and the same
number of next-hop indices. Hence, we need to use 18 bits to represent a next-hop
index. The conversion described in the previous section converts n prefixes to at most
2n + 1 ranges. Therefore, our data structure is designed to store
$2 \cdot 2^{18} + 1 = 524289$ ranges. Our design goal is to perform a routing lookup in
at most 4 memory accesses.

5.2 Compressed Forwarding Table Representation

The basic principle behind the compressed forwarding table is to repeatedly reduce the
size of the sub-universes in up to four steps corresponding to the design parameter of 4
memory accesses. In the present invention, the size of each sub-problem instance (i.e.
number of ranges) with respect to the size of the sub-universe is carefully observed
to determine the best approach for representing the ranges in that particular
sub-universe.

5.2.1 Overview of the Data Structure 

In order to get a better understanding of the basic data structure used in the present 
invention and the following principles for size optimization, we will assume for now 
that the method of repeated reduction is used exclusively. 

Initially, we have a set of $n$ ranges $\mathbf{R} = \{R_1, R_2, \ldots, R_n\}$
constituting a partition of the original universe U. The first step is to partition U
into $2^{16}$ sub-universes $U_0, U_1, \ldots, U_{2^{16}-1}$ of size $2^{16}$ each.
This is achieved by starting with $2^{16}$ empty sets of 16 bits ranges
$\mathbf{R}_0, \mathbf{R}_1, \ldots, \mathbf{R}_{2^{16}-1}$ and repeating the following
until $\mathbf{R}$ is empty. Let $R = r^{lo} \ldots r^{hi} \in \mathbf{R}$ and
$s = r^{lo}[31 \ldots 16]$. If

$$r^{lo}[31 \ldots 16] = r^{hi}[31 \ldots 16]$$

then remove $R$ from $\mathbf{R}$ and add $r^{lo}[15 \ldots 0] \ldots r^{hi}[15 \ldots 0]$
to $\mathbf{R}_s$. Otherwise, add

$$r^{lo}[15 \ldots 0] \ldots (2^{16} - 1)$$

to $\mathbf{R}_s$ and replace $R$ by $2^{16} \cdot (s+1) \ldots r^{hi}$ in $\mathbf{R}$.
Upon completion of the procedure, $\mathbf{R}$ is empty and the $n$ 32 bits ranges are
converted into $n$ or more 16 bits ranges distributed among the $2^{16}$ range sets.
For each $\mathbf{R}_i$ containing only one range, let $T_i$ be the next-hop index of
that range. Otherwise, let $T_i$ be the data structure (not yet described)
representing the ranges of $\mathbf{R}_i$ and supporting the 16 bit range matching
operation for looking up the next-hop index. By organizing pointers to
$T_0, T_1, \ldots,$ and $T_{2^{16}-1}$ in a Static Vector Node, or pointer array, of
size $2^{16}$, we obtain a data structure supporting complete 32 bit range matching.
Given an IPv4 address $a$, we first extract the 16 most significant bits
$a[31 \ldots 16]$ and use these to index into the array to retrieve
$T_{a[31 \ldots 16]}$. We then look up $a[15 \ldots 0]$ in $T_{a[31 \ldots 16]}$ using
the 16 bit range matching operation supported by the sub-data structure.
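The first partitioning step can be sketched in Python as follows. The sketch is only an illustration of the splitting rule, not the storage format of the embodiment: ranges are assumed to be `(lo, hi, next_hop)` tuples, the range sets are kept in a dictionary keyed by the 16 most significant bits, and the function name `split_into_subuniverses` is hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int, int]  # (lo, hi, next-hop index) - assumed layout


def split_into_subuniverses(ranges: List[Range]) -> Dict[int, List[Range]]:
    """Split 32 bits ranges into 16 bits ranges, grouped by the upper 16 address bits."""
    subsets: Dict[int, List[Range]] = {}
    work = list(ranges)
    while work:
        lo, hi, hop = work.pop()
        s = lo >> 16  # s = r_lo[31...16]
        if s == hi >> 16:
            # The range lies entirely within sub-universe s.
            subsets.setdefault(s, []).append((lo & 0xFFFF, hi & 0xFFFF, hop))
        else:
            # Cut off the part inside sub-universe s and keep the remainder,
            # which starts at 2^16 * (s + 1), for further processing.
            subsets.setdefault(s, []).append((lo & 0xFFFF, 0xFFFF, hop))
            work.append(((s + 1) << 16, hi, hop))
    for pieces in subsets.values():
        pieces.sort()
    return subsets


# Example: two ranges meeting inside the sub-universe with s = 1
parts = split_into_subuniverses([(0, 0x00017FFF, 1), (0x00018000, 0xFFFFFFFF, 2)])
```

In the example, the sub-universe with s = 1 receives two 16 bits ranges and therefore needs a sub-data structure, whereas every other sub-universe receives exactly one range and can be represented directly by its next-hop index.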

Let $|\mathbf{R}_i| > 1$ and consider the representation of $T_i$. By applying the
same idea as above, we partition $U_i$ into $2^8$ sub-universes
$U_{i,0}, U_{i,1}, \ldots, U_{i,2^8-1}$. This is followed by processing the set of
ranges $\mathbf{R}_i$ into $\mathbf{R}_{i,0}, \mathbf{R}_{i,1}, \ldots, \mathbf{R}_{i,2^8-1}$
and representing these by sub-data structures

$$T_{i,0}, T_{i,1}, \ldots, T_{i,2^8-1}.$$

Range matching in $T_i$ is achieved by using $a[15 \ldots 8]$ to look up
$T_{i,a[15 \ldots 8]}$ by direct indexing in the static vector node, followed by
looking up $a[7 \ldots 0]$ in $T_{i,a[15 \ldots 8]}$.

We apply the same idea on the $T_{i,j}$'s by partitioning each $U_{i,j}$ into $2^4$
sub-universes

$$U_{i,j,0}, U_{i,j,1}, \ldots, U_{i,j,2^4-1}.$$

The resulting range sets, the $\mathbf{R}_{i,j,k}$'s, are represented by data structures

$$T_{i,j,0}, T_{i,j,1}, \ldots, T_{i,j,2^4-1}.$$

Lookup is achieved by using $a[7 \ldots 4]$ to index into the static vector node
containing pointers to the sub-data structures. The result is either a next-hop index
or an array of $2^4$ next-hop indices which is indexed, using the 4 least significant
bits $a[3 \ldots 0]$ of the IPv4 address, to complete the lookup.
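The repeated direct indexing can be sketched in Python as follows, assuming for the sake of illustration that each static vector node is a plain list whose entries are either an integer (a next-hop index) or another list (the next level). The names `index_fields` and `svn_lookup` are hypothetical, and the sketch ignores the pointer encoding and packing discussed later.

```python
from typing import List, Tuple, Union

Node = Union[int, List["Node"]]  # next-hop index or nested vector node (assumed model)


def index_fields(a: int) -> Tuple[int, int, int, int]:
    """Split an IPv4 address into a[31...16], a[15...8], a[7...4] and a[3...0]."""
    return (a >> 16) & 0xFFFF, (a >> 8) & 0xFF, (a >> 4) & 0xF, a & 0xF


def svn_lookup(root: List[Node], a: int) -> int:
    """Follow at most four levels of direct indexing until a next-hop index is found."""
    node: Node = root
    for index in index_fields(a):
        node = node[index]
        if isinstance(node, int):
            return node
    raise ValueError("level 4 entries must be next-hop indices")


# Example: a sparse four level structure around the address 0x0A010234
level4 = list(range(100, 116))  # 2^4 next-hop indices
level3 = [9] * 16
level3[0x3] = level4
level2 = [8] * 256
level2[0x02] = level3
root = [7] * 65536
root[0x0A01] = level2
```

With this structure, `svn_lookup(root, 0x0A010234)` descends through all four levels, while an address outside the refined sub-universes terminates at an earlier level.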



Level   |U|        min |R|   Size       Address bits   Pointer size (bits)
1       $2^{32}$   ∞         $2^{16}$   31...16        32
2       $2^{16}$   2313      $2^{8}$    15...8         20
3       $2^{8}$    137       $2^{4}$    7...4          20
4       $2^{4}$    12        $2^{4}$    3...0          18



In the table above, we give a summary of the figures related to the repeated
partitioning and usage of static vector nodes. For a given level, |U| is the size of
the sub-universe represented at that level. When introducing space optimizations, we
will use alternative and more space efficient approaches instead of static vector nodes
if the number of ranges is less than min |R|. The Size column shows the number of
pointers and the Address bits column shows which bits of the IPv4 address are being
used for indexing at each level. Depending on the level and alignment conditions for
the various sub-structures we will use, the Pointer size, or number of bits required
for representing a pointer, varies somewhat between the levels. For example, we only
need 18 bits at the lowest level since we can guarantee that the result from the lookup
is a next-hop index.

In what follows, we give an overview of the various sub-data structures we will use
and in the next few sections we will describe each kind of structure in detail.



Lvl   |R| (min...max)    Structure                Height   Byte alignment
2     2...29             Dynamic Flat Tree        1...3    4
2     30...201           Dynamic Layered Tree 1   1...3    8
2     201...2312         Dynamic Layered Tree 2   1...3    32
2     2313...$2^{16}$    Static Vector Node       N/A      2048/3
3     2...136            Dynamic Layered Tree 3   1...2    32
3     137...$2^{8}$      Static Vector Node       N/A      128/3
4     2...8              Dynamic Layered Tree 3   1        32
4     9...11             Static Flat Tree         1        32
4     12...$2^{4}$       Static Vector Node       N/A      256/7



The first column shows the level of the data structure in question. Level 1 is the top
level, i.e. the static vector node with $2^{16}$ pointers. At each level, we will
determine which data structure to use based on the number of ranges (see Figure 1).
For each level and data structure we show the range density interval, in the |R|
column, for which that data structure offers the most space efficient representation.
In the third column, we show the name of the data structure and the fourth column
contains the height of the data structure measured in the number of memory accesses
required to perform the lookup. Observe that we use three slightly different kinds of
dynamic layered trees. All data structures starting at level 2 are at most 3 levels
high except for the static vector node where the concept of height is not directly
applicable. This means that no matter which of these we use, no more than 3 memory
accesses are spent for completing the lookup after the first memory access at level 1.
In the same way, the data structures starting at level 3 are at most two levels high
and the ones starting at level 4 are at most 1 level high. Hence, we will never use
more than 4 memory accesses for a complete lookup. In the last column we show the
byte alignment of the various structures (for static vector nodes, we show alignment
for 3 and 7 node packages respectively). The alignment is important in the following
discussion regarding pointer sizes and pointer space utilization for static vector nodes.

Figure 1 is a flowchart illustrating the hierarchy of the different data sub-structures
and the corresponding lookup procedures according to this embodiment of the
invention. Level 1 in the figure is a Static Vector Node (SVN) of size
$2^{16} = 65536$ pointers. If the number of ranges |R| at level 2 does not exceed 2312
one of the sub-data structures Dynamic Layered Tree 1, Dynamic Layered Tree 2 or
Dynamic Flat Tree is used for representing the set of ranges and the corresponding
lookup procedures are used for completing the lookup. Thus, if $201 < |R| \le 2312$ the
Dynamic Layered Tree 1 sub-data structure representation is used. If
$29 < |R| \le 201$ the sub-data structure Dynamic Layered Tree 2 is used, and if
$1 < |R| \le 29$ the binary sub-data structure Dynamic Flat Tree is used. Otherwise, if
$|R| > 2312$ at level 2, a static vector node SVN lookup is performed and the lookup
continues at level 3. If the number of ranges |R| at level 3 does not exceed 136 the
Dynamic Layered Tree 3 sub-data structure representation is used. Otherwise, for
$136 < |R| \le 2^8$ (= 256), an SVN lookup is performed and the lookup continues at
level 4. Finally, at level 4, if the number of ranges |R| does not exceed 11 one of the
sub-data structures Dynamic Layered Tree 3 or Static Flat Tree is used for completing
the lookup procedure. Thus, if $8 < |R| \le 11$ the binary sub-data structure Dynamic
Layered Tree 3 is used, and if $1 < |R| \le 8$ the sub-data structure Static Flat Tree
is used. Otherwise, for $11 < |R| \le 16$ an SVN is used for completing the lookup and
obtaining the desired enclosed next-hop index.
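The density-based selection can be summarized by a small Python sketch. The thresholds 29, 201, 2312, 136, 8 and 11 are those stated in the text; the structure names attached to each interval follow the overview table of sub-data structures, which names the level 2 layered trees and the level 4 structures in a different order than the description of Figure 1, so the names should be read as an assumption. The function name `choose_structure` and the string labels are hypothetical.

```python
# Rows: (level, largest |R| handled, structure name), ascending per level.
# Names follow the overview table (assumption); the first matching row wins,
# which also settles the shared boundary value 201 at level 2.
SELECTION_TABLE = [
    (2, 29, "Dynamic Flat Tree"),
    (2, 201, "Dynamic Layered Tree 1"),
    (2, 2312, "Dynamic Layered Tree 2"),
    (2, 2**16, "Static Vector Node"),
    (3, 136, "Dynamic Layered Tree 3"),
    (3, 2**8, "Static Vector Node"),
    (4, 8, "Dynamic Layered Tree 3"),
    (4, 11, "Static Flat Tree"),
    (4, 2**4, "Static Vector Node"),
]


def choose_structure(level: int, n_ranges: int) -> str:
    """Pick the representation for a sub-universe with n_ranges ranges at a level."""
    if n_ranges < 1:
        raise ValueError("a sub-universe contains at least one range")
    if level == 1:
        return "Static Vector Node"  # the top level is always a pointer array
    if n_ranges == 1:
        return "next-hop index"      # a single range needs no sub-structure
    for row_level, max_ranges, name in SELECTION_TABLE:
        if row_level == level and n_ranges <= max_ranges:
            return name
    raise ValueError("too many ranges for this level")
```

Keeping the thresholds in one table makes it straightforward to swap the names if the assignment in the description of Figure 1 is the intended one.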

5.2.2 Static Vector Nodes

The level 1 or top level data structure consists of a root node with $2^{16}$ potential
sub-trees. The 16 most significant bits of the IPv4 address are used to index into
the array representing the root node to extract a 32 bits non-negative integer that
either encodes an index into the next-hop table or a pointer to a sub-data structure as
shown in the table above. As mentioned above, we will use different kinds of sub-trees
depending on the number of range boundaries that lie in the sub-universe, i.e. the
density of the sub-universe. The encoding of next-hop indices and sub-tree pointers is
described in Figure 2, which shows the next-hop index and sub-tree kind and pointer
encoding scheme used in level 1 Static Vector Nodes.

Observe that for densities in the range 2 ... 201 we must encode the size of the 
sub-tree exactly to accommodate for quantization reduction space optimizations. Such 
optimizations are not required when the density exceeds 201 (encoded as 202). 

At level 2 and 3 we will use 20 bits pointers in the Static Vector Nodes. We
reserve the two most significant bits for encoding the pointer kind. The remaining bits
are sufficient to encode 18 bits next-hop indices as well as 18 bits pointers referring
to Dynamic Layered Trees, Static Flat Trees, or Static Vector Nodes. These data
structures are 32 bytes aligned. This means that 18 bits is sufficient to address a
memory area of $2^{18} \cdot 32 = 8388608$ bytes.
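A minimal Python sketch of such a 20 bits pointer is given below. Only the split into 2 kind bits and 18 payload bits and the 32 bytes alignment are taken from the text; the numeric values of the kinds and the names `encode_pointer`, `decode_pointer` and `byte_address` are hypothetical and chosen for the illustration.

```python
POINTER_BITS = 20
PAYLOAD_BITS = 18
ALIGNMENT_BYTES = 32

# Hypothetical kind codes; the text only states that two bits encode the kind.
KIND_NEXT_HOP = 0
KIND_LAYERED_TREE = 1
KIND_FLAT_TREE = 2
KIND_VECTOR_NODE = 3


def encode_pointer(kind: int, payload: int) -> int:
    """Pack a 2 bits kind and an 18 bits payload into a 20 bits pointer."""
    if not 0 <= kind < 4 or not 0 <= payload < (1 << PAYLOAD_BITS):
        raise ValueError("kind or payload out of range")
    return (kind << PAYLOAD_BITS) | payload


def decode_pointer(pointer: int) -> tuple:
    """Return (kind, payload) of a 20 bits pointer."""
    return pointer >> PAYLOAD_BITS, pointer & ((1 << PAYLOAD_BITS) - 1)


def byte_address(payload: int) -> int:
    """Byte offset of a 32 bytes aligned structure referred to by the payload."""
    return payload * ALIGNMENT_BYTES
```

The largest payload refers to the last 32 bytes block of the addressable area, i.e. `byte_address(2**18 - 1) + 32` equals the 8388608 bytes stated above.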

The 20 bits pointers used at level 2 are not so easily aligned. However, since the
only requirement on a vector node is that it is possible to index among the pointers
and retrieve one pointer in one memory access, we are able to use some tricks to
achieve a space efficient representation. In one cache line, we can fit
$\lfloor 256/20 \rfloor = 12$ whole 20 bits pointers. By using 21 + 1/3 cache lines, we
can store (21 + 1/3) · 12 = 256 pointers to represent a static vector node at level 2
(see Figure 1). To improve alignment conditions, we pack level 2 nodes together in
groups of three. Each package requires (21 + 1/3) · 3 · 32 = 2048 consecutive bytes of
storage resulting in a byte alignment of 2048/3 (2048 bytes block per package of 3
nodes).

We can use the same approach to represent the level 3 nodes. By using 1 + 1/3
cache lines, we can store (1 + 1/3) · 12 = 16 pointers to represent a Static Vector Node
at level 3 as appears from Figure 1. The alignment is improved in the same fashion as
above, by again packing the nodes together in groups of 3. At this level, each package
requires (1 + 1/3) · 3 · 32 = 128 bytes resulting in a byte alignment of 128/3.
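The packing arithmetic can be verified with a short Python sketch. The function name `vector_node_layout` is hypothetical; the observation that the group size equals the denominator of the fractional number of cache lines per node is our reading of the figures above rather than a statement from the text. The function accepts any pointer width, so it can also be applied to narrower pointers.

```python
from fractions import Fraction

CACHE_LINE_BYTES = 32


def vector_node_layout(pointer_bits: int, pointers_per_node: int):
    """Return (pointers per cache line, cache lines per node,
    nodes per package, bytes per package) for a packed vector node."""
    per_line = (CACHE_LINE_BYTES * 8) // pointer_bits     # whole pointers per line
    lines_per_node = Fraction(pointers_per_node, per_line)
    nodes_per_package = lines_per_node.denominator        # smallest whole-line group
    package_bytes = int(lines_per_node * nodes_per_package * CACHE_LINE_BYTES)
    return per_line, lines_per_node, nodes_per_package, package_bytes


level2 = vector_node_layout(20, 256)  # 12 pointers/line, 21 + 1/3 lines, 3 nodes, 2048 bytes
level3 = vector_node_layout(20, 16)   # 12 pointers/line, 1 + 1/3 lines, 3 nodes, 128 bytes
```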

At the lowest level, i.e. level 4, the pointers in the static vector nodes must
represent next-hop indices. Therefore, we only need 18 bits pointers at this level. In
one cache line, we can fit $\lfloor 256/18 \rfloor = 14$ whole 18 bits pointers. By
using 1 + 1/7 cache lines, we can represent the (1 + 1/7) · 14 = 16 pointers required
for a complete level 4 node. We pack the nodes together in groups of 7 to achieve
better alignment. Each package occupies a memory area of (1 + 1/7) · 7 · 32 = 256
bytes, resulting in a byte alignment of 256/7.

WO 03/063427 (PCT/SE03/00064)

5.2.3 Dynamic Flat Trees 

Let $R = \{R_1, R_2, \ldots, R_n\}$ be the set of ranges and $d_1, d_2, \ldots, d_n$ be
the associated next-hop indices constituting the current sub-problem
($2 \le n \le 29$). Moreover, let $r_1, r_2, \ldots, r_{n-1}$ be the sorted list of
range boundaries obtained by taking the $r^{lo}$ from each range in R. By this
construction, we have that $d_1$ is associated with $0..r_1$, $d_2$ is associated with
$r_1+1..r_2$, and so on until finally $d_n$ is associated with $r_{n-1}+1..\infty$.
Recall that each $d_i$ is an 18 bits non-negative integer and let $h_i$ and $l_i$ be
the 2 most significant bits of $d_i$ and the 16 least significant bits of $d_i$
respectively. Moreover, let

$$H = h_1 h_2 \ldots h_n 0^k,$$

where $k$ is the minimum number of zeroes that needs to be appended in order to achieve

$$|H| \equiv 0 \pmod{16}.$$

After packing the high 2 bits of the next-hop indices and padding H with zeroes, the
information we need to represent consists of

$$n - 1 + n + \left\lceil \frac{n}{8} \right\rceil = 2n - 1 + \left\lceil \frac{n}{8} \right\rceil$$

16 bits blocks. To improve the alignment conditions, we want to use an even number
of 16 bits blocks. Therefore, we add one empty block if
$2n - 1 + \lceil n/8 \rceil$ is odd. We then get a total of

$$u_{16}(n) = 2 \cdot \left\lceil \frac{2n - 1 + \lceil n/8 \rceil}{2} \right\rceil$$

16 bits blocks.

Let $B_1, B_2, \ldots, B_{u_{16}(n)}$ denote the information blocks and
$S_1, S_2, \ldots, S_{u_{16}(n)}$ denote the memory slots that will contain the
information blocks. For $n = 16$ we have

$$u_{16}(16) = 2 \cdot \left\lceil \frac{2 \cdot 16 - 1 + \lceil 16/8 \rceil}{2} \right\rceil = 34 = 16 + 16 + 2.$$

Since 32 blocks correspond to exactly 2 cache lines and 16 bits blocks are packaged
in pairs, these 34 blocks cannot be distributed over more than 3 cache lines. It follows
that the search cost is bounded by 3 memory accesses no matter how we store the
information blocks in the memory slots. Therefore, we store $B_1$ in $S_1$, $B_2$ in
$S_2$, ..., and $B_{u_{16}(n)}$ in $S_{u_{16}(n)}$ whenever $n \le 16$.

For $n = 17$ we have

$$u_{16}(17) = 2 \cdot \left\lceil \frac{2 \cdot 17 - 1 + \lceil 17/8 \rceil}{2} \right\rceil = 36 = 2 + 16 + 16 + 2.$$

If we do not pay attention to the storage of information blocks in the memory slots and 
the location of the memory slots with respect to cache line boundaries, the information 
may be distributed over 4 cache lines resulting in a search cost of 4 memory accesses. 
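The block count and the worst case number of cache lines touched can be checked with a short Python sketch. The block count is taken here as twice the ceiling of half of 2n - 1 + ceil(n/8), which reproduces the values 34 and 36 given for n = 16 and n = 17. The 4 bytes alignment used for the worst case is the byte alignment listed for Dynamic Flat Trees in the overview table; the function names `u16` and `max_cache_lines` are hypothetical.

```python
CACHE_LINE_BYTES = 32
BLOCK_BYTES = 2  # one 16 bits block


def u16(n: int) -> int:
    """Number of 16 bits blocks for n ranges, rounded up to an even count."""
    raw = 2 * n - 1 + (n + 7) // 8  # n-1 boundaries, n low parts, ceil(n/8) blocks for H
    return 2 * ((raw + 1) // 2)


def max_cache_lines(blocks: int, alignment: int = 4) -> int:
    """Worst case number of cache lines covered by a contiguous run of blocks
    that may start at any offset that is a multiple of `alignment` bytes."""
    size = blocks * BLOCK_BYTES
    worst = 0
    for offset in range(0, CACHE_LINE_BYTES, alignment):
        lines = (offset + size + CACHE_LINE_BYTES - 1) // CACHE_LINE_BYTES
        worst = max(worst, lines)
    return worst
```

For 34 blocks the worst case is 3 cache lines, whereas 36 blocks stored without regard to cache line boundaries can touch 4, which is why the storage order matters from n = 17 onwards.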

Let $S_i$ be the first memory slot located in the beginning of a cache line. For
$n \ge 17$ we store $B_1$ in $S_i$, $B_2$ in $S_{i+1}$, $B_3$ in $S_{i+2}$, ...,
$B_{u_{16}(n)-(i-1)}$ in $S_{u_{16}(n)}$, $B_{u_{16}(n)-(i-2)}$ in $S_1$,
$B_{u_{16}(n)-(i-3)}$ in $S_2$, ..., and $B_{u_{16}(n)}$ in $S_{i-1}$.

The actual information is finally stored in the information blocks as follows:

• $H$ is stored in $B_1 \ldots B_{\lceil n/8 \rceil}$

• $r_1 \ldots r_{n-1}$ is stored in $B_{\lceil n/8 \rceil + 1} \ldots B_{\lceil n/8 \rceil + n - 1}$

• $l_1 \ldots l_n$ is stored in $B_{\lceil n/8 \rceil + n} \ldots B_{\lceil n/8 \rceil + 2n - 1}$

By accessing the first cache line containing $B_1 \ldots B_{16}$ we can extract the 2
most significant bits of the resulting next-hop index and also search among the first
range boundaries. In the second memory access, we complete the search among the range
boundaries in $B_{17}, \ldots, B_{32}$. The third memory access is used to extract the
16 least significant bits of the resulting next-hop index.

In Figure 3 we show a.n example of the layout of the information with respect to 
cache line boundaries. The Figure illustrates a Dynamic Flat Tree sub-data structure 
containing 25 ranges. Observe that the last 16 least significant bits of the last three 
next- hop indices £23,^24? and Z25 are stored in the end of the first cache line. The 
lookup will start by accessing the second cache line in use. 
5.2.4 Dynamic Layered Trees

Dynamic Layered Trees consist of two building blocks, leafs and nodes. Leafs consist
of up to 8 next-hop indices and up to 7 range boundaries packed in a cache line and
nodes consist of up to 16 range boundaries packed in a cache line. Figure 4 illustrates
the Dynamic Layered Tree leaf and the encoding of range boundaries and next-hop
indices in the upper part of the figure, and the corresponding node containing only
range boundaries in the lower part of the figure.

Let $r_1, r_2, \ldots, r_{n-1}$ be a sorted list of range boundaries and $d_1, d_2, \ldots, d_n$ be the
corresponding list of next-hop indices. As above, $d_1$ is associated with $0..r_1$, $d_2$ is
associated with $r_1 + 1..r_2$, and so on until finally $d_n$ is associated with $r_{n-1} + 1..\infty$.

For n = 8, we construct a tree simply by storing the range boundaries and next-hop 
indices in a leaf as shown in Figure 4. 

For n = 136 = 17 · 8, the tree consists of 17 leafs, each representing 8 ranges,
and 1 node representing 16 range boundaries. The first leaf contains $r_1, r_2, \ldots, r_7$ and
$d_1, d_2, \ldots, d_8$, the second leaf contains $r_9, r_{10}, \ldots, r_{15}$ and $d_9, d_{10}, \ldots, d_{16}$, and so on
until the last leaf which contains $r_{129}, r_{130}, \ldots, r_{135}$ and $d_{129}, d_{130}, \ldots, d_{136}$. That is,
the ith leaf contains

$$ r_{8(i-1)+1}, r_{8(i-1)+2}, \ldots, r_{8(i-1)+7} $$

and

$$ d_{8(i-1)+1}, d_{8(i-1)+2}, \ldots, d_{8(i-1)+8}. $$

The node contains $r_8, r_{16}, \ldots, r_{8i}, \ldots, r_{128}$. By searching the node in 1 memory access,
we can determine in which leaf to complete the search. We can repeat this procedure
to handle arbitrarily large sets of ranges and next-hop indices. For each level added,
the number of ranges that can be handled increases by a factor of 17. Hence, by using
t levels, we can handle



$$ \mathrm{DLTnum}(t) = \begin{cases} 8, & \text{for } t = 1 \\ 17 \cdot \mathrm{DLTnum}(t - 1), & \text{otherwise} \end{cases} $$

ranges. For t = 1, 2, 3 we get the sizes 8, 136 and 2312 for complete trees of height 1, 2, and
3 respectively. When $n \ge 202$ we can afford to use partially filled cache lines at
all levels at the same time. In some cases, we will however store a leaf directly below
a level 3 node. When arriving at that leaf, we need a way to tell if it is a node or a
leaf. This is achieved by storing the first two range boundaries in decreasing order in
a leaf (but not in a node), swapping them before searching, and swapping them back
after completing the search. The extra cost for this encoding/decoding is negligible.
The size of a t level complete tree, measured in number of cache lines, is given by

$$ \mathrm{DLTsize}(t) = \begin{cases} 1, & \text{for } t = 1 \\ 1 + 17 \cdot \mathrm{DLTsize}(t - 1), & \text{otherwise} \end{cases} $$
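The two recurrences translate directly into C. The names `dlt_num` and `dlt_size` below are hypothetical; the sketch only restates the capacity and size formulas for complete trees.

```c
#include <assert.h>

/* Number of ranges handled by a complete Dynamic Layered Tree with t levels. */
static unsigned long dlt_num(unsigned t)
{
    assert(t >= 1);
    return t == 1 ? 8UL : 17UL * dlt_num(t - 1);
}

/* Size of a complete t-level tree in cache lines: one node plus 17 sub-trees. */
static unsigned long dlt_size(unsigned t)
{
    assert(t >= 1);
    return t == 1 ? 1UL : 1UL + 17UL * dlt_size(t - 1);
}
```

A complete 3-level tree thus handles 2312 ranges in 307 cache lines, that is 9824 bytes with 32-byte cache lines.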

For $n \le 201$, the cost for incomplete trees resulting in partially filled cache lines
becomes too high with respect to the number of ranges handled. For example, when
n is in the range 137 ... 201 only one single 16 bits range boundary is used in the
level 3 node. To reduce quantization effects we will try to store partially filled nodes
and leafs as compactly as possible without introducing extra memory accesses for the
lookup. Let

$$ k_2 = \lfloor n/136 \rfloor, \quad n_2 = n \bmod 136, \quad k_1 = \lfloor n_2/8 \rfloor, \quad n_1 = n_2 \bmod 8. $$

For $30 \le n \le 201$, $k_2$ is the number of complete level 2 trees and $n_2$ is the number of
remaining ranges after storing as many as possible in complete level 2 trees. Similarly
$k_1$ is the number of complete level 1 trees and $n_1$ is the number of ranges remaining
after storing as many as possible in complete level 2 and level 1 trees. If n > 136,
we need a partial level 3 node consisting of exactly one range boundary. In addition,
we need a partial level 2 node if $n \ne 136$ and finally, we need a partial level 1 leaf if
$n_1 \ne 0$. The number of range boundaries we need to store in the partial level 2 node
equals $k_1 - 1$ if $n_1 = 0$ and $k_1$ otherwise. By accounting for storing high 2 bits of
next-hop indices, range boundaries, and low 16 bits of next-hop indices, the number
of 16 bits blocks required to store the partial leaf becomes

$$ u_1(n) = 1 + n_1 + (n_1 - 1) = 2 n_1 = 2 \cdot (n_2 \bmod 8) = 2 \cdot ((n \bmod 136) \bmod 8). $$
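The total number of 16 bit blocks required to store the partial nodes and leaf is then
given by

$$ u(n) = u_1(n) + u_2(n) + u_3(n), $$

where $u_1(n)$ is defined as above,

$$ u_2(n) = \begin{cases} k_1 - 1, & \text{if } n_1 = 0 \\ k_1, & \text{otherwise} \end{cases}
 = \begin{cases} \lfloor n_2/8 \rfloor - 1, & \text{if } n_2 \bmod 8 = 0 \\ \lfloor n_2/8 \rfloor, & \text{otherwise} \end{cases}
 = \begin{cases} \lfloor (n \bmod 136)/8 \rfloor - 1, & \text{if } (n \bmod 136) \bmod 8 = 0 \\ \lfloor (n \bmod 136)/8 \rfloor, & \text{otherwise} \end{cases} $$

and

$$ u_3(n) = \begin{cases} 0, & \text{if } n \le 136 \\ 1, & \text{otherwise.} \end{cases} $$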
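As a cross-check of the head size formulas, the following sketch computes the three terms for a given n. The struct and function names are hypothetical, and the case n = 136 (a complete level 2 tree with no head) is excluded by assumption.

```c
#include <assert.h>

struct dlt_head { unsigned u1, u2, u3; };

/* 16-bit blocks needed for the partial leaf (u1), the partial level 2 node (u2)
 * and the partial level 3 node (u3) of a small Dynamic Layered Tree. */
static struct dlt_head dlt_head_blocks(unsigned n)
{
    assert(n >= 30 && n <= 201 && n != 136);
    unsigned n2 = n % 136;   /* ranges left after complete level 2 trees */
    unsigned k1 = n2 / 8;    /* complete level 1 trees (leafs) */
    unsigned n1 = n2 % 8;    /* ranges left for the partial leaf */
    struct dlt_head h;
    h.u1 = 2 * n1;
    h.u2 = (n1 == 0) ? k1 - 1 : k1;
    h.u3 = (n > 136) ? 1 : 0;
    return h;
}

static unsigned dlt_head_total(unsigned n)
{
    struct dlt_head h = dlt_head_blocks(n);
    return h.u1 + h.u2 + h.u3;
}
```

For n = 174 this gives 12, 4 and 1 blocks respectively, 17 in total.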

We call the partial leaf and nodes the head of the tree and the remaining complete
level 2 tree and level 1 nodes the tail of the tree. The tail consists of 32 byte blocks
and is stored in a memory area designated for 32 byte aligned structures. The head,
on the other hand, consists of sufficiently many 8 byte blocks for storing the u(n) 16
bits blocks and a 32 bits pointer to the tail. As shown in Figure 5, which shows a
head with $u_1(n) = 12$, $u_2(n) = 4$, and $u_3(n) = 1$, the tail pointer is stored in the
first 8 byte block together with the level 3 node (L3). Since each 8 byte block is
completely contained within a cache line, at most 1 memory access is required for
searching the head at level 3, leaving 2 memory accesses as required for completing
the search at level 2 in the tail. The maximum value of $u_2(n)$ is 16 (complete level
2 node). This means that the level 2 node stored in the head cannot be distributed
over more than 2 cache lines. The first part of the node will be stored in the same 8
byte block (and therefore also the same cache line) as the tail pointer, which means
that the first memory access is already accounted for. In the second memory access
we are guaranteed to complete the level 2 search in the head, leaving the third and
last memory access for completing the search in a leaf stored either in the head or in
the tail.

At level 3 and 4 range sub-sets from the same sub-universe $U_i$ (first partitioning
step) can share a Dynamic Layered Tree as long as the number of memory accesses for
completing the lookups does not exceed 2 and 1 respectively. For example, consider the
three range sets $R_{i,j_1}$, $R_{i,j_2}$, and $R_{i,j_3,k}$ where $j_1 < j_2 < j_3$, $|R_{i,j_1}| = 87$, $|R_{i,j_2}| = 44$,
and $|R_{i,j_3,k}| = 5$. These can be represented using a complete level 2 DLT since the
total number of ranges is 87 + 44 + 5 = 136 and the choice of $j_1$, $j_2$, and $j_3$ guarantees
that the representation of $R_{i,j_3,k}$ does not straddle a cache line boundary. The pointers
referring to the sub-structures representing $R_{i,j_1}$ and $R_{i,j_2}$ will refer to the complete
level 2 tree and the pointer referring to the sub-structure representing $R_{i,j_3,k}$ will refer
to the leaf of the complete level 2 tree containing $R_{i,j_3,k}$.

5.2.5 Static Flat Trees

A Static Flat Tree is used at level 4 to represent a sorted list of 4 bit range boundaries
$r_1, r_2, \ldots, r_{n-1}$ and the corresponding list of next-hop indices $d_1, \ldots, d_n$. It is used
only when n equals 9, 10, or 11. As above, $d_1$ is associated with $0..r_1$, $d_2$ is associated
with $r_1 + 1..r_2$, and so on until finally $d_n$ is associated with $r_{n-1} + 1..\infty$. The data
structure and the search method are basically the same as for a dynamic layered tree leaf
but with one difference. Instead of storing 7 16 bits range boundaries and 8 next-hop
indices, we store 10 4 bits range boundaries and 11 next-hop indices (we fill the tree
up to 11 ranges even if we have 9). The total number of bits required for this is
10 · 4 + 11 · 18 = 238. Figure 6 illustrates a Static Flat Tree containing 11 ranges with
8 unused bits in the R-area and 10 in the H-area.

5.3 Compressed Forwarding Table Lookup

Up to this point, we have described how to convert the prefix matching problem into a
range matching problem. We have also described all the pieces necessary for achieving
a space efficient representation of the set of ranges - a representation in which we can
perform the range matching operation in four memory accesses, thereby achieving an
efficient forwarding table representation.

In this section, we describe the lookup procedures in more detail.

5.3.1 Main Lookup Procedure

The main lookup function is represented by the static vector node lookup SVN_lookup.
It accepts two parameters, the IPv4 address a and a pointer to the data structure
T. Initially, T refers to the level 1 SVN containing $T_0, \ldots, T_{2^{16}-1}$ (see Figure 1). The
array elements are all 32 bits non-negative integers so indexing and retrieving the
sub-level pointer is straightforward. In lines 1 ... 12 the first level lookup is performed.
Depending on the result from decoding the pointer value, the lookup is either
completed (line 3) or continues by calling a custom lookup procedure (lines 6, 8, or 10).
If neither of these applies, the lookup continues with the next level SVN (line 13).
SVN_lookup(T, a)

T ← T_{a[31...16]}                                                   1
if T[31...24] = 0 then                                               2
    return T[23...0]                                                 3
elsif T[31...24] ≠ 1 then                                            4
    if T[31...24] ≤ 29 then                                          5
        return DFT_lookup(T[23...0], T[31...24], a[15...0])          6
    elsif T[31...24] ≤ 201 then                                      7
        return DLT_lookup^1(T[23...0], T[31...24], a[15...0])        8
    else                                                             9
        return DLT_lookup^{2,3}(T[23...0], a[15...0])               10
    end                                                             11
end                                                                 12
T ← T[23...0]                                                       13
T ← T_{a[15...8]}                                                   14
if T[19...18] = 0 then                                              15
    return T[17...0]                                                16
elsif T[19...18] ≠ 1 then                                           17
    return DLT_lookup^{2,3}(T[17...0], a[15...0])                   18
end                                                                 19
T ← T[17...0]                                                       20
T ← T_{a[7...4]}                                                    21
if T[19...18] = 0 then                                              22
    return T[17...0]                                                23
elsif T[19...18] ≠ 1 then                                           24
    if T[19...18] = 2 then                                          25
        return DLT_lookup^{2,3}(T[17...0], a[15...0])               26
    else                                                            27
        return SFT_lookup(T[17...0], a[15...0])                     28
    end                                                             29
end                                                                 30
T ← T[17...0]                                                       31
return T_{a[3...0]}                                                 32
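The decoding of a level 1 entry can be sketched in C as follows. This assumes one particular reading of the 8-bit code field in bits 31...24: 0 marks a next-hop index, 1 a pointer to the next SVN level, 2 to 29 a Dynamic Flat Tree, 30 to 201 a small Dynamic Layered Tree, and larger values a large Dynamic Layered Tree. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum svn1_kind {
    SVN1_NEXT_HOP,    /* payload is the next-hop index */
    SVN1_NEXT_LEVEL,  /* payload points to a level 2 SVN */
    SVN1_DFT,         /* payload points to a Dynamic Flat Tree */
    SVN1_DLT_SMALL,   /* payload points to the head of a small DLT */
    SVN1_DLT_LARGE    /* payload points to a large or complete DLT */
};

/* Classify a 32-bit level 1 entry by its code field in bits 31...24. */
static enum svn1_kind svn1_classify(uint32_t entry)
{
    uint32_t code = entry >> 24;
    if (code == 0)   return SVN1_NEXT_HOP;
    if (code == 1)   return SVN1_NEXT_LEVEL;
    if (code <= 29)  return SVN1_DFT;
    if (code <= 201) return SVN1_DLT_SMALL;
    return SVN1_DLT_LARGE;
}

/* Low 24 bits: next-hop index or pointer, depending on the kind. */
static uint32_t svn1_payload(uint32_t entry) { return entry & 0x00FFFFFFu; }

/* Index into the level 1 array: the 16 most significant address bits. */
static uint32_t svn1_index(uint32_t a) { return a >> 16; }
```

Under this reading the code field doubles as the range count n passed to the flat tree and small layered tree lookups.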
The second level SVN lookup begins in line 14 by indexing into the array
represented by T which contains pointers $T_0, \ldots, T_{2^8-1}$. Level 2 nodes are packaged in
groups of three. The 16 most significant bits of the pointer are used to address the
group and the 2 least significant bits are used to locate the node within the group.
Within the group, the first four pointers in each cache line belong to the first node,
the next four pointers in each cache line belong to the second node, and the last
four pointers within each cache line belong to the third node. The high bits from
the address a[15...10] are used to locate the cache line, the 2 least significant bits
from the node pointer are used to locate the group of four pointers within the cache
line, and the low bits from the address a[9...8] are used to index within the group
of pointers. Depending on the result from decoding the retrieved pointer, the lookup
is either finished (line 16), continued by calling the custom procedure (line 18), or
continued with the next SVN level (line 20).

When entering the third level of the lookup, T refers to an array containing pointers
$T_0, \ldots, T_{2^4-1}$. Level 3 nodes are also packaged in groups of three and the pointers are
interpreted in the same way as at level 2 to retrieve the location of the group and the
node within the group. The indexing is basically the same. We use the high bits from
the address a[7...6] to locate the cache line, the 2 least significant bits from the
node pointer to locate the group of four pointers within the cache line, and the low
bits from the address a[5...4] to index within the group of pointers. After
decoding the pointer, we know whether the lookup is finished (line 23) or whether to
continue the lookup in a custom procedure (line 26 or 28). Otherwise, the lookup is
continued at the next and final level - level 4.

At level 4, T refers to an array containing next-hops $T_0, \ldots, T_{2^4-1}$. The nodes are
packaged together in groups of 7. Therefore, the 15 most significant bits of the node
pointer are used to locate the group and the 3 least significant bits are used to locate
the node within the group. We use the same kind of organization within each cache
line as above. That is, the first two (instead of four) pointers belong to the first
node, the next two pointers belong to the second node and so on until the last
two pointers that belong to the seventh node. Consequently, we must use the high
address bits a[3...1] to locate the cache line, the lowest three pointer bits to locate
the group of two next-hop indices within the cache line, and the lowest address bit
a[0] to index within the group to retrieve the next-hop index.
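The interleaved addressing at levels 2 to 4 amounts to a few shifts and masks. The sketch below computes only the logical position (group, cache line within the group, entry within the cache line). It assumes the node-within-group field is numbered from 0, and it leaves out how entries are physically bit-packed in a cache line, which the text does not spell out. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct svn_pos {
    uint32_t group;  /* which group of nodes */
    unsigned line;   /* cache line within the group */
    unsigned slot;   /* entry within the cache line */
};

/* Level 2: groups of three nodes, 4 pointers per node in each of 64 cache lines. */
static struct svn_pos svn2_pos(uint32_t ptr18, uint32_t a)
{
    struct svn_pos p;
    p.group = ptr18 >> 2;
    p.line = (a >> 10) & 0x3Fu;                  /* a[15...10] */
    p.slot = (ptr18 & 3u) * 4 + ((a >> 8) & 3u); /* node, then a[9...8] */
    return p;
}

/* Level 3: same scheme with 4 cache lines per group. */
static struct svn_pos svn3_pos(uint32_t ptr18, uint32_t a)
{
    struct svn_pos p;
    p.group = ptr18 >> 2;
    p.line = (a >> 6) & 3u;                      /* a[7...6] */
    p.slot = (ptr18 & 3u) * 4 + ((a >> 4) & 3u); /* node, then a[5...4] */
    return p;
}

/* Level 4: groups of seven nodes, 2 next-hop indices per node in each of 8 lines. */
static struct svn_pos svn4_pos(uint32_t ptr18, uint32_t a)
{
    struct svn_pos p;
    p.group = ptr18 >> 3;
    p.line = (a >> 1) & 7u;                      /* a[3...1] */
    p.slot = (ptr18 & 7u) * 2 + (a & 1u);        /* node, then a[0] */
    return p;
}
```

Interleaving the nodes of a group this way lets 12 level 2 or level 3 pointers, or 14 next-hop indices, share a cache line while each lookup still touches exactly one line per level.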
5.3.2 Index Search

The key operation in all custom lookup procedures (except SVN_lookup) is an efficient
binary search that is used to compute indices into arrays of next-hop indices. Given an
array of range boundaries $r_0, \ldots, r_{n-1}$, sorted in increasing order, and an address a we
compute the minimum index i satisfying $a \le r_i$. If no such i is present, the result is
n. Throughout the procedure, we heavily exploit pointer arithmetic, combined with
interpretation of boolean values as numerical values, to achieve a procedure with a
minimum of conditional branches. That is, for a pointer r representing the address
of the first element of the array we can write $r_i$ as $(r + i)_0$. Moreover, the boolean
values false and true will be used as 0 and 1 respectively in computations.



ix_search(r, n, a)

b ← r                                        1
k ← ⌊lg n⌋                                   2
m ← 2^k                                      3
r ← r + (n − m) · (r_{m−1} < a)              4
while k > 0 do                               5
    k ← k − 1                                6
    r ← r + (r_{2^k} < a) · 2^k              7
end                                          8
return (r − b) + (r_0 < a)                   9



In lines 1 to 3 we record the address of the first element in b, compute the floor of the
base 2 logarithm of n and assign it to k, and compute 2 to the power of k and assign
the value to m. The idea is to quickly determine if the search is to be performed among
the m first range boundaries or among the m last range boundaries. When this is
done, we can repeatedly compare the middle element and cut an array of size $2^k$ in
half (decreasing k in each step) until only one element remains. The actual decision
and possible modification of r is performed in line 4. If $r_{m-1} \ge a$ the numerical value
of $(r_{m-1} < a)$ is 0 and r is assigned to itself (no modification). Otherwise, the start
address of r is moved n − m steps forward. In either case, the remaining search is
performed among $m = 2^k$ elements.

The query key a is repeatedly compared to the middle element and r is modified
to reflect the result of the comparisons until only one element remains (lines 5 ... 8).
Finally, the difference between the location of the first element in the original array and
the element remaining is computed and after adding the result from the comparison
with the last element the result is returned (line 9).
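A C rendering of the index search, for 16-bit range boundaries, might look as follows. It is a sketch rather than the original implementation: the guard for an empty array and the loop that computes the logarithm are additions, and the example array is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the minimum index i with a <= r[i], or n if there is none.
 * r must be sorted in increasing order. Comparisons are used as 0/1 values
 * so that the only branches left are the loop conditions. */
static size_t ix_search(const uint16_t *r, size_t n, uint16_t a)
{
    const uint16_t *b = r;                  /* remember the first element */
    if (n == 0)
        return 0;                           /* guard, not in the pseudocode */
    unsigned k = 0;
    while ((n >> (k + 1)) != 0)             /* k = floor(lg n) */
        k++;
    size_t m = (size_t)1 << k;
    r += (n - m) * (size_t)(r[m - 1] < a);  /* first m or last m elements */
    while (k > 0) {
        k--;
        r += (size_t)(r[(size_t)1 << k] < a) << k;  /* halve the window */
    }
    return (size_t)(r - b) + (size_t)(r[0] < a);
}

/* Hypothetical example data: seven sorted range boundaries. */
static const uint16_t ix_example[7] = {10, 20, 30, 40, 50, 60, 70};
```

With the seven boundaries above, the key 35 yields index 3 (the range ending at 40), and any key above 70 yields 7, the index of the last range.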

5.3.3 Dynamic Flat Tree Lookup 

The dynamic flat tree structure is stored in an array of 16 bits memory slots

$$ T_1, T_2, \ldots, T_{\lceil n/8 \rceil + 2n - 1}. $$

The actual organization of the tree depends on the size and the location T of the first
memory slot with respect to cache line boundaries. If $n \le 16$ or the first memory slot is
located at the beginning of a cache line the tree is organized as follows: $H = h_1, \ldots, h_n$
is stored in $T_1 \ldots T_{\lceil n/8 \rceil}$, $r_1, \ldots, r_{n-1}$ are stored in $T_{\lceil n/8 \rceil + 1} \ldots T_{\lceil n/8 \rceil + n - 1}$, and $l_1, \ldots, l_n$ are stored
in $T_{\lceil n/8 \rceil + n} \ldots T_{\lceil n/8 \rceil + 2n - 1}$. The procedure begins by computing the cache line boundary
offset q (line 1). Depending on the value of n and q (q = 0 implies that the first memory
slot is located at the beginning of a cache line), the lookup is either completed in lines 3
and 4 or continues with the more complex cases in line 6. In line 3 the index (0 ... n − 1)
of the next-hop index is computed using ix_search, and in line 4 the resulting next-hop
index is assembled (and returned) by appending $h_{i+1} = T_{\lfloor i/8 \rfloor + 1}[i \bmod 8 \ldots (i \bmod 8) + 1]$
to $l_{i+1} = T_{\lceil n/8 \rceil + n + i}$.

DFT_lookup(T, n, a)

q ← (16 − (T mod 16)) mod 16                                                      1
if (n ≤ 16) ∨ (q = 0) then                                                        2
    i ← ix_search((T_{⌈n/8⌉+1}, ..., T_{⌈n/8⌉+n−1}), n − 1, a)                    3
    return T_{⌊i/8⌋+1}[i mod 8 ... (i mod 8) + 1] T_{⌈n/8⌉+n+i}                   4
end                                                                               5
i ← ix_search((T_{q+⌈n/8⌉+1}, ..., T_{q+⌈n/8⌉+n−1}), n − 1, a)                    6
if i < n − q then
    return T_{q+⌊i/8⌋+1}[i mod 8 ... (i mod 8) + 1] T_{q+⌈n/8⌉+n+i}               7
end                                                                               8
return T_{q+⌊i/8⌋+1}[i mod 8 ... (i mod 8) + 1] T_{i+q−n+1}                       9

In the more complex cases (n > 16 and $q \ne 0$) the tree is organized as follows:
$H = h_1, \ldots, h_n$ is stored in $T_{q+1} \ldots T_{q+\lceil n/8 \rceil}$, $r_1, \ldots, r_{n-1}$ are stored in $T_{q+\lceil n/8 \rceil+1} \ldots T_{q+\lceil n/8 \rceil+n-1}$,
$l_1, \ldots, l_{n-q}$ are stored in $T_{q+\lceil n/8 \rceil+n} \ldots T_{\lceil n/8 \rceil+2n-1}$, and $l_{n-q+1}, \ldots, l_n$ are stored in $T_1 \ldots T_q$.
The computation of the index (line 6) is essentially the same as above except for
considering the cache line boundary offset. If the low 16 bits from the resulting
next-hop index are located at the end of the occupied memory area (i < n − q), the retrieval
and assembly of the pieces are straightforward (line 7). Otherwise, the low 16 bits
from the resulting next-hop index are located at the beginning of the memory area.
Retrieval and assembly is finally performed in line 9.
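The following C sketch illustrates both layouts with 0-based slot indices. Several things are assumptions made for the example rather than details confirmed by the text: the offset q is passed in as a parameter instead of being derived from the address, the 2-bit high part of next-hop index i is taken from bits 2·(i mod 8) and 2·(i mod 8) + 1 of its slot, and a simple counting loop stands in for the index search. All names and data are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the index search: the number of boundaries below a equals
 * the minimum i with a <= r[i] (or cnt if there is none). */
static size_t count_below(const uint16_t *r, size_t cnt, uint16_t a)
{
    size_t i = 0;
    for (size_t j = 0; j < cnt; j++)
        i += (size_t)(r[j] < a);
    return i;
}

/* T: 16-bit slots, n: number of ranges, q: number of slots in front of the
 * first cache line boundary. Returns the assembled 18-bit next-hop index. */
static uint32_t dft_lookup(const uint16_t *T, size_t n, size_t q, uint16_t a)
{
    assert(n >= 1 && q < 16);
    if (n <= 16)
        q = 0;                           /* small trees are stored unwrapped */
    size_t hw = (n + 7) / 8;             /* slots holding the packed high bits */
    const uint16_t *H = T + q;           /* high bits start at the cache line */
    const uint16_t *R = H + hw;          /* n - 1 range boundaries */
    const uint16_t *L = R + (n - 1);     /* low parts l_1 ... l_{n-q} */
    size_t i = count_below(R, n - 1, a);
    uint32_t high = (uint32_t)(H[i / 8] >> (2 * (i % 8))) & 3u;
    uint16_t low = (i < n - q) ? L[i] : T[i - (n - q)];  /* wrapped low parts */
    return (high << 16) | low;
}

/* n = 3, aligned: high bits, boundaries 100 and 200, three low parts. */
static const uint16_t dft_small[6] = {0x0039, 100, 200, 1, 2, 3};

/* n = 17, q = 2: low parts of ranges 16 and 17 wrap to the first two slots. */
static const uint16_t dft_wrapped[36] = {
    1016, 1017,
    0x3939, 0x3939, 0x0001,
    10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160,
    1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012,
    1013, 1014, 1015
};
```

In the wrapped example, a key of 155 falls in range 16, whose low part is found in the very first slot, in front of the cache line boundary.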

5.3.4 Dynamic Layered Tree Lookup 

We distinguish between two major kinds of dynamic layered tree lookup. The
straightforward version DLT_lookup^{2,3} (two minor kinds within this major) is used either
when the trees are complete or when the size is large enough to allow us to disregard
quantization effects. When the trees are small it is necessary to pay attention to
quantization effects. As described in Section 5.2.4, this requires a more complex
representation and a corresponding lookup procedure - DLT_lookup^1.

The DLT_lookup^1 procedure accepts three arguments: T, n, and a, where T
represents the head of the tree structure stored in an array of 16 bits memory slots
$T_1, T_2, \ldots$, and n and a are the size and query address (least significant 16 bits of
the original address) respectively. In line 1 we extract the pointer to the tail of the
tree and assign the value to t for later use. Since the starting point of the tail is 32
byte aligned, t is actually an index to a cache line. This is followed by initializing
an offset variable o to 0 (line 2). If $\lceil n/136 \rceil > 1$ as tested in line 3, the head contains
a partial level 3 (L3) node. Since $n \le 201 < 2 \cdot 136$, we know that the partial L3
node contains exactly one range boundary stored in $T_3$. If $a \le T_3$ the straightforward
lookup procedure is called in line 5 to finish the lookup in the complete L2 sub-tree
stored in the beginning of the tail and referred to by t. Otherwise, the complete L2
tree is skipped by adding its size to t (line 7) and the offset o is increased by the
size (i.e. 1) of the partial L3 node (line 8). In line 10 the size $n_2$ (number of range
boundaries) of the partial L2 node is computed, followed by the index search (line
11).
DLT_lookup^1(T, n, a)

t ← T_1 T_2                                                                   1
o ← 0                                                                         2
if ⌈n/136⌉ > 1 then                                                           3
    if a ≤ T_3 then                                                           4
        return DLT_lookup^{2,3}(t, 2, a)                                      5
    end                                                                       6
    t ← t + 1 + 17                                                            7
    o ← o + 1                                                                 8
end                                                                           9
n_2 ← ⌈(n mod 136)/8⌉ − 1                                                    10
i ← ix_search((T_{2+o+1}, ..., T_{2+o+n_2}), n_2, a)                         11
if i < n_2 then                                                              12
    return DLT_lookup^{2,3}(t + i, 1, a)                                     13
end                                                                          14
n_1 ← (n mod 136) mod 8                                                      15
i ← ix_search((T_{2+o+n_2+1}, ..., T_{1+o+n_2+n_1}), n_1 − 1, a)             16
return T_{2+o+n_2+2n_1}[i mod 8 ... (i mod 8) + 1] T_{2+o+n_2+n_1+i}         17



If $i < n_2$ the lookup continues by searching in the (i + 1)th complete L1 tree
stored in the tail at t + i (line 13). Otherwise, the lookup is completed by searching
the partial L1 tree (leaf) stored in the head. This is accomplished as follows. First the
size $n_1$ of the partial L1 tree is computed (line 15). The range boundaries $r_1, \ldots, r_{n_1 - 1}$
are stored in

$$ T_{2+o+n_2+1}, \ldots, T_{2+o+n_2+n_1-1}, $$

the low bits of the next-hop indices $l_1, \ldots, l_{n_1}$ are stored in

$$ T_{2+o+n_2+n_1}, \ldots, T_{2+o+n_2+2n_1-1}, $$

and the high bits of the next-hop indices $h_1, \ldots, h_{n_1}$ are stored in $T_{2+o+n_2+2n_1}$. It
remains to perform the index search (line 16) and extract and assemble the next-hop
index (line 17).

The straightforward lookup DLT_lookup^{2,3} (below) is slightly simpler. It accepts
three arguments: T, which refers to an array of 16 bits memory slots $T_1, \ldots, T_{16}$, the level
t, and the query key a. Independent of the value of t, the tree referred to by T might
be either a node or a leaf. Nevertheless, there are 7 range boundaries $r_1, \ldots, r_7$ stored
in $T_1, \ldots, T_7$. If it is a node, it contains 16 range boundaries $r_1, \ldots, r_{16}$ stored in $T_1, \ldots, T_{16}$.
The first two range boundaries of a node are stored in sorted order. This is tested
in line 1 followed by performing an index search. Thereafter, DLT_lookup^{2,3} is
recursively called to search the next level sub-tree, which has size DLTsize(t − 1) and is
stored at T + 1 + i · DLTsize(t − 1)
(see lines 2 ... 3 and the definition of DLTsize in Section 5.2.4).

DLT_lookup^{2,3}(T, t, a)

if T_1 < T_2 then                                                            1
    i ← ix_search((T_1, ..., T_16), 16, a)                                   2
    return DLT_lookup^{2,3}(T + 1 + i · DLTsize(t − 1), t − 1, a)            3
end                                                                          4
i ← ix_search((T_2, T_1, T_3, ..., T_7), 7, a)                               5
    \ T_1 and T_2 are swapped /
return T_16[i mod 8 ... (i mod 8) + 1] T_{8+i}                               6

If it is a leaf, it is organized as follows: the range boundaries $r_1, \ldots, r_7$ are stored in $T_1, \ldots, T_7$,
the low bits from next-hop indices $l_1, \ldots, l_8$ are stored in $T_8, \ldots, T_{15}$, and the high bits from
next-hop indices $h_1, \ldots, h_8$ are stored in $T_{16}$. Searching the leaf is simply
accomplished by first swapping the two first range boundaries, performing an index search
to compute i and then swapping the first two range boundaries back (line 5). The
last step is to assemble the pieces from the next-hop index and return the result.
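A leaf search with the swapped-boundary marker can be sketched in C as below. The sketch deviates from the text in one respect, stated here as an assumption: instead of swapping the two boundaries in place and swapping them back, it searches a local copy, which leaves the shared table untouched. The bit position of the 2-bit high part (bits 2i and 2i + 1 of the last slot) is likewise an assumption, a counting loop stands in for the index search, and `dlt_child_offset` assumes that the children of a level t node are complete trees of level t − 1. All names and data are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A node stores its first two boundaries in sorted order, a leaf does not. */
static int dlt_is_node(const uint16_t *T) { return T[0] < T[1]; }

/* Size in cache lines of a complete t-level tree. */
static size_t dlt_size(unsigned t)
{
    assert(t >= 1);
    return t == 1 ? 1 : 1 + 17 * dlt_size(t - 1);
}

/* Cache line offset of child i below a node at level t. */
static size_t dlt_child_offset(size_t i, unsigned t)
{
    assert(t >= 2 && i <= 16);
    return 1 + i * dlt_size(t - 1);
}

/* Leaf layout (0-based): T[0..6] boundaries, T[7..14] low parts, T[15] high bits. */
static uint32_t dlt_leaf_lookup(const uint16_t *T, uint16_t a)
{
    assert(!dlt_is_node(T));
    uint16_t r[7];
    r[0] = T[1];                          /* undo the swap on a local copy */
    r[1] = T[0];
    for (size_t j = 2; j < 7; j++)
        r[j] = T[j];
    size_t i = 0;
    for (size_t j = 0; j < 7; j++)
        i += (size_t)(r[j] < a);          /* stand-in for the index search */
    uint32_t high = (uint32_t)(T[15] >> (2 * i)) & 3u;
    return (high << 16) | T[7 + i];
}

/* Boundaries 10..70 with the first two swapped, low parts 1001..1008. */
static const uint16_t dlt_example_leaf[16] = {
    20, 10, 30, 40, 50, 60, 70,
    1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008,
    0x3939
};
```

Searching a copy costs a few extra register moves but keeps the lookup free of writes, which matters if several lookups run concurrently on the same table.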

5.3.5 Static Flat Tree Lookup

A Static Flat Tree (SFT) is very similar to a dynamic layered tree leaf. The first
difference is that the size of the range boundaries is 4 bits in the SFT compared to 16
in the DLT. This requires a special version ix_search^4 for searching the 4 bits range
boundaries. The second difference is the number of range boundaries which is 10 in
the SFT compared to 7 in the DLT. The tree consists of an array of 16 bits memory
slots $T_1, \ldots, T_{16}$ organized as follows: the 4 bit range boundaries $r_1, \ldots, r_{10}$ are stored
in $T_1 \ldots T_3$, the low 16 bits of the next-hop indices $l_1, \ldots, l_{11}$ are stored in $T_4, \ldots, T_{14}$,
and the high 2 bits of the next-hop indices are stored in $T_{15} \ldots T_{16}$.

SFT_lookup(T, n, a)

i ← ix_search^4((T_1[3...0], ..., T_3[7...4]), 10, a)                 1
return T_{15+⌊i/8⌋}[i mod 8 ... (i mod 8) + 1] T_{4+i}                2

As in DLT leafs, lookup is achieved by performing an index search (line 1). The
complete array of 4 bits range boundaries is given by

(T_1[3...0], T_1[7...4], T_1[11...8], T_1[15...12], T_2[3...0],
T_2[7...4], T_2[11...8], T_2[15...12], T_3[3...0], T_3[7...4]).

This is followed by accessing and assembling the next-hop index and returning the
result (line 2).
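The nibble addressing of the Static Flat Tree can be illustrated in C. As before this is a sketch under assumptions: slots are 0-based, the 2-bit high part of next-hop index i is taken from bits 2·(i mod 8) and 2·(i mod 8) + 1 of its slot, and a counting loop replaces the 4-bit index search. The names and the example tree are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 4-bit range boundary j (0..9): four boundaries per 16-bit slot, low nibble first. */
static unsigned sft_boundary(const uint16_t *T, size_t j)
{
    assert(j < 10);
    return (unsigned)(T[j / 4] >> (4 * (j % 4))) & 0xFu;
}

/* Layout (0-based): T[0..2] boundaries, T[3..13] low parts, T[14..15] high bits. */
static uint32_t sft_lookup(const uint16_t *T, unsigned a)
{
    size_t i = 0;
    for (size_t j = 0; j < 10; j++)
        i += (size_t)(sft_boundary(T, j) < (a & 0xFu));  /* stand-in search */
    uint32_t high = (uint32_t)(T[14 + i / 8] >> (2 * (i % 8))) & 3u;
    return (high << 16) | T[3 + i];
}

/* Boundaries 1..10, low parts 2001..2011, high bits 1,2,3,0,1,2,3,0,1,2,3. */
static const uint16_t sft_example[16] = {
    0x4321, 0x8765, 0x00A9,
    2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
    0x3939, 0x0039
};
```

The whole structure fits in one 32-byte cache line, so the lookup costs a single memory access.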

5.4 Router System Architecture 

Figure 7 is a schematic block diagram of a router system architecture chosen as an 
example. There are three autonomous processes: the Routing Process, the Builder 
Process, and the Forwarding Process, and three data structures: the Routing Table, 
the Forwarding Table, and the Next-hop Table. 

All datagrams entering and exiting the system are handled by the Forwarding
Process. The basic operation is to receive a packet (fp1), look up the next-hop index
in the forwarding table using SVN_lookup(T, a) where T is the forwarding table
and a is the destination address (fp2), retrieve the next-hop information from the
next-hop table (fp3), and forward the packet according to the next-hop information
(fp4).

The Routing Process, or Routing Protocol Daemon, is a process that communicates 
with other systems in the router's neighborhood using a routing protocol to learn 
about changes in network topology and to update the routing table to accommodate 
for such changes. An example of a routing process, or more precisely a computer 
program product defining the routing process, is the Gate Daemon (gateD) available 
from www.gated.org. Different routing protocols work in different ways and use
different optimization criteria to determine which is the most efficient path through
the network and thereby to determine what updates to perform on the routing table.
The basic operation of all router processes is however to detect or learn about network
topology changes by communicating according to the routing protocol (rp1), update
the routing table to accommodate for these changes (rp2), and inform the neighbors
(routing processes) about topology changes (rp3).

As opposed to the dynamic routing table, the semi-static forwarding table is not
designed to be directly manipulated by the routing process. Dynamic and static
here refer to the computational complexity of performing table updates - dynamic
means easy and static means that the complete table must be rebuilt. Modifications
of the routing table are propagated into modifications of the forwarding table in a
controlled manner by the workings of a Builder Process. There are several ways of
scheduling forwarding table rebuilds or partial rebuilds and a commonly used method
is to slightly postpone the next rebuild after detecting a routing table update. For
example, the builder process can accumulate all changes to the routing table during
a pre-determined period of time before executing the rebuild. In this way, a burst
of routing table modifications (which is not uncommon) can be handled by a single
forwarding table rebuild (or update). The basic operation of the Builder Process is
to wait until a routing table update occurs (bp1), retrieve the range boundaries and
next-hop indices using the generic method described in Section 5.1.2 (bp2), or a more
efficient method optimized for the routing table representation used, and rebuild the
forwarding table according to the representation described in Section 5.2 (bp3).

5.5 Small Routing and Forwarding Tables 

A simple and straight forward way of representing the routing table is to store address 
prefixes together with next-hop indices (NH) in entries that are linked to each other 
in a layered tree-like structure as shown in Figure 7. Each entry has a pointer to the 
next entry to the right (if present) and to the first child entry (if present). Entries at 
the same level are stored in sorted order with respect to the smallest address matching 
the prefix of the entry (i.e. the starting point of the range constituted by the prefix). 
Entries with prefixes that are sub-sets of the prefix of a given entry are stored in levels
below the given entry. The detailed information stored in the routing table entries
shown in Figure 7 is shown in the table below.

In this example, traversing the routing table and computing the
intermediate range partition representation results in the following.
To represent this in a forwarding table, we use a special kind of Dynamic Layered
Tree suitable for smaller routing tables. The principle is exactly the same as for regular
DLTs but instead of using 16 bits range boundaries, we use 32 bits range boundaries.
Moreover, the number of bits used for representing next-hop indices is reduced to 16.
In this data structure, which we call 32-bit Dynamic Layered Tree, a node consists of
8 range boundaries instead of 16 and a leaf contains 5 range boundaries and 6
next-hop indices instead of 7 and 8 respectively. This means that we can represent up to
6 · 9 · 9 · 9 = 4374 ranges or 2187 routes using this technique without exceeding the
design limit of 4 memory accesses. In this configuration, the maximum amount of
memory required to represent 2187 routes is

32 + 9 · (32 + 9 · (32 + 9 · 32)) = 26240 bytes.
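The capacity and memory figures follow from two simple recurrences, sketched here in C with hypothetical function names: a leaf holds 6 ranges in one 32-byte cache line, and each node level multiplies the capacity by 9 and adds one cache line on top of 9 sub-trees.

```c
#include <assert.h>

/* Number of ranges held by a complete 32-bit DLT with the given number of levels. */
static unsigned long dlt32_ranges(unsigned levels)
{
    assert(levels >= 1);
    return levels == 1 ? 6UL : 9UL * dlt32_ranges(levels - 1);
}

/* Memory in bytes of a complete 32-bit DLT with the given number of levels. */
static unsigned long dlt32_bytes(unsigned levels)
{
    assert(levels >= 1);
    return levels == 1 ? 32UL : 32UL + 9UL * dlt32_bytes(levels - 1);
}
```

Four levels correspond to the design limit of four memory accesses, giving 4374 ranges in 26240 bytes.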

The 32-bit DLT built to represent the range partition above is shown in Figure 8.
Each row represents a cache line that stores either a node (the first three rows) or
a leaf (the last five rows). Observe that quantization effects are disregarded which
means that this particular representation is slightly less memory space efficient than
the full-blown compressed forwarding table for large routing tables. The customized
lookup and index search procedures for 32-bit DLTs are shown in C-code below.
#include <sys/types.h>	/* u_int16_t, u_int32_t */

/* Sub-tree sizes in 32-bit words; one 32-byte cache line holds 8 words. */
#define _FT_SIZE_1 (8)
#define _FT_SIZE_2 (8 + 9 * _FT_SIZE_1)
#define _FT_SIZE_3 (8 + 9 * _FT_SIZE_2)

/* Branch-free index search among the 5 range boundaries of a leaf. */
static inline int
DLT32_ix_search5(u_int32_t *tkey, u_int32_t qkey)
{
	u_int32_t *base = tkey;

	tkey += *tkey <= qkey;
	tkey += (*(tkey + 2) <= qkey) << 1;
	tkey += (*(tkey + 1) <= qkey) << 0;
	return (int) ((tkey - base) + (*tkey <= qkey));
}

/* Branch-free index search among the 8 range boundaries of a node. */
static inline int
DLT32_ix_search8(u_int32_t *tkey, u_int32_t qkey)
{
	u_int32_t *base = tkey;

	tkey += (*(tkey + 4) <= qkey) << 2;
	tkey += (*(tkey + 2) <= qkey) << 1;
	tkey += (*(tkey + 1) <= qkey) << 0;
	return (int) ((tkey - base) + (*tkey <= qkey));
}

/* Three node levels followed by a leaf: four memory accesses in total. */
static inline int
DLT32_lookup(u_int32_t *ft, u_int32_t key)
{
	ft += 8 + _FT_SIZE_3 * DLT32_ix_search8(ft, key);
	ft += 8 + _FT_SIZE_2 * DLT32_ix_search8(ft, key);
	ft += 8 + _FT_SIZE_1 * DLT32_ix_search8(ft, key);
	/* the 6 next-hop indices follow the 5 range boundaries of the leaf */
	return ((u_int16_t *) (ft + 5))[DLT32_ix_search5(ft, key)];
}
DLT32_ix_search5 and DLT32_ix_search8 are custom index search procedures
for searching among 5 and 8 range boundaries respectively. Since both the number
of levels (four) and the number of range boundaries to search are known in each step,
the lookup can be performed as a deterministic sequence of arithmetical operations,
without conditional branching, to fully exploit the pipelining of instructions performed
by the CPU.






WO 03/063427 



PCT/SE03/00064 



6 Claims 

1. A method for IP routing lookup to determine where to forward an IP-datagram 
with a given destination address by retrieving from a routing table a next-hop 
index indicating where to forward said datagram, said next-hop index being 
associated with the longest matching prefix of said destination address, said 
address being a number in an address universe U, wherein a set of address 
prefixes P and a mapping of P onto a set of next-hop indices D are converted 
into a set of ranges R, constituting a partition of U, and a mapping of R 
onto D, characterized by the steps of building and storing in a memory a 
forwarding table representation from R and D by using a predetermined layered 
data structure where the construction of the layer is selected depending on the 
range density |R'| for the sub-universe U' represented by that layer to get a 
space efficient representation of the set of ranges R, and performing the lookup 
by a range matching operation in said forwarding table. 
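
The conversion from prefixes to ranges and the range matching lookup can be illustrated with a small sketch. It assumes a brute-force longest-prefix match during construction and a flat sorted array searched by binary search, not the layered structures of the claims; all names (`struct prefix`, `build_ranges`, `range_lookup`, `demo_routes`, `NO_ROUTE`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_ROUTE   0    /* hypothetical next-hop index meaning "no match" */
#define MAX_RANGES 64   /* capacity of this sketch: needs 2 * np + 1 entries */

/* addr must have its host bits cleared. */
struct prefix { uint32_t addr; int len; uint16_t hop; };

struct range_table {
    int n;
    uint32_t start[MAX_RANGES];  /* range i covers start[i] .. start[i+1]-1 */
    uint16_t hop[MAX_RANGES];
};

static uint32_t prefix_mask(int len)
{
    return len == 0 ? 0 : ~(uint32_t)0 << (32 - len);
}

/* Brute-force longest-prefix match, used only while building the ranges. */
static uint16_t lpm(const struct prefix *p, int np, uint32_t a)
{
    int best = -1;
    uint16_t hop = NO_ROUTE;
    for (int i = 0; i < np; i++)
        if ((a & prefix_mask(p[i].len)) == p[i].addr && p[i].len > best) {
            best = p[i].len;
            hop = p[i].hop;
        }
    return hop;
}

static int cmp_u32(const void *x, const void *y)
{
    uint32_t a = *(const uint32_t *)x, b = *(const uint32_t *)y;
    return (a > b) - (a < b);
}

/* Convert prefixes P (with next hops D) into ranges R partitioning U. */
static void build_ranges(const struct prefix *p, int np, struct range_table *t)
{
    uint32_t b[MAX_RANGES];
    int nb = 0;

    assert(2 * np + 1 <= MAX_RANGES);
    b[nb++] = 0;                                 /* U starts at address 0 */
    for (int i = 0; i < np; i++) {
        uint32_t last = p[i].addr | ~prefix_mask(p[i].len);
        b[nb++] = p[i].addr;                     /* first address of prefix */
        if (last != UINT32_MAX)
            b[nb++] = last + 1;                  /* first address after it */
    }
    qsort(b, nb, sizeof b[0], cmp_u32);
    t->n = 0;
    for (int i = 0; i < nb; i++) {
        if (i > 0 && b[i] == b[i - 1])
            continue;                            /* duplicate boundary */
        uint16_t h = lpm(p, np, b[i]);
        if (t->n > 0 && t->hop[t->n - 1] == h)
            continue;                            /* merge equal neighbours */
        t->start[t->n] = b[i];
        t->hop[t->n] = h;
        t->n++;
    }
}

/* Range matching: binary search for the last range start <= a. */
static uint16_t range_lookup(const struct range_table *t, uint32_t a)
{
    int lo = 0, hi = t->n - 1;
    while (lo < hi) {
        int mid = (lo + hi + 1) / 2;
        if (t->start[mid] <= a) lo = mid; else hi = mid - 1;
    }
    return t->hop[lo];
}

/* Hypothetical demo routes: 10.0.0.0/8 -> hop 1, 10.1.0.0/16 -> hop 2. */
static const struct prefix demo_routes[] = {
    { 0x0A000000u, 8, 1 }, { 0x0A010000u, 16, 2 },
};

static uint16_t demo_lookup(uint32_t a)
{
    struct range_table t;
    build_ranges(demo_routes, 2, &t);
    return range_lookup(&t, a);
}
```

With the two demo routes the universe is split into five ranges, and an address inside 10.1.0.0/16 falls in the range that carries the next hop of the longer prefix.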

2. The method according to claim 1, characterized in that the number of routes 
to be represented in the forwarding table and a maximum allowable number of 
memory accesses for the lookup are prescribed, whereupon said data structures 
are selected as construction blocks for the building of the forwarding table in 
such a way that memory needs are minimized. 

3. The method according to claim 1, characterized in that the number of routes 
to be represented in the forwarding table and a maximum available memory 
capacity are prescribed, whereupon said data structures are selected as 
construction blocks for the building of the forwarding table in such a way that the 
number of memory accesses for the lookup is minimized. 

4. The method according to any of the preceding claims, characterized in that 
said data structures for the building of the forwarding table are selectable among 
the structures Dynamic Flat Tree, Dynamic Layered Tree and Static Flat Tree 
and variations thereof. 

5. The method according to any of the preceding claims, characterized in that, 
to repeatedly reduce the size of the address sub-universes in successive levels, 
a Static Vector Node (SVN) representation is used in a first level for the 
forwarding table representation, one of a first set of said predetermined layered 
data structures, selectable depending on the actual value of the range density 
|R_i| to give the most efficient table representation, is chosen for building the 
forwarding table in a second level, if |R_i| is less than a predetermined first limit 
value, otherwise a SVN representation is used, in a third level one of a second 
set of said predetermined data structures, selectable depending on the actual 
value of the range density |R_{i,j}| to give the most efficient table representation, 
is chosen for building the forwarding table in this third level, if |R_{i,j}| is less than 
a predetermined second limit value, otherwise a SVN representation is used, etc., 
until all levels have been represented. 

6. The method according to claim 5, characterized in that an initial SVN lookup 
by direct indexing is performed in said first level to retrieve a pointer that 
either represents a next-hop index, which means that the lookup is completed, 
or refers to a sub-data structure where the lookup is continued on said second 
level and, depending on the value of |R_i|, encoded in the pointer value and 
obtained by decoding the pointer value, a lookup is completed in a forwarding 
table represented by one of said first set of said predetermined layered data 
structures if |R_i| is less than said predetermined first limit value, otherwise a 
SVN lookup by direct indexing is performed in said second level to retrieve a 
pointer that either represents a next-hop index, which means that the lookup 
is completed, or refers to a sub-data structure where the lookup is continued 
on said third level and, depending on the value of |R_{i,j}|, encoded in the pointer 
value and obtained by decoding the pointer value, a lookup is completed in a 
forwarding table represented by one of said second set of said predetermined 
layered data structures if |R_{i,j}| is less than said predetermined second limit 
value, otherwise a SVN lookup by direct indexing is performed, etc., until the 
lookup is completed. 

7. The method according to any of the claims 1 - 2 or 4 - 6, characterized in 
that the maximum number of routes is fixed to 2^18 routes and the maximum 
allowable number of memory accesses to 4, whereupon said data structures are 
selected as construction blocks for the building of the forwarding table in such 
a way that memory needs are less than 2.7 Mbytes. 

8. The method according to any of the claims 1 or 3 - 6, characterized in that 
the maximum number of routes is fixed to 2^18 routes and the maximum available 
memory capacity to 2.7 Mbytes, whereupon said data structures are selected as 
construction blocks for the building of the forwarding table in such a way that 
the number of memory accesses for the lookup is less than or equal to 4. 

9. The method according to any of the preceding claims, characterized in that, 
to repeatedly reduce the size of the address sub-universes in successive levels, a 
Static Vector Node (SVN) representation of size 2^16 pointers is used in a first 
level for the forwarding table representation, one of a first set of said predetermined 
layered data structures, selectable depending on the actual value of the 
range density |R_i| to give the most efficient table representation, is chosen for 
building the forwarding table in a second level, if |R_i| is less than a predetermined 
first limit value 2313, otherwise a SVN representation of size 2^8 pointers 
is used, in a third level one of a second set of said predetermined data structures, 
selectable depending on the actual value of the range density |R_{i,j}| to give the 
most efficient table representation, is chosen for building the forwarding table 
in this third level, if |R_{i,j}| is less than a predetermined second limit value 137, 
otherwise a SVN representation of size 2^4 pointers is used, and in a fourth level 
one of a third set of said predetermined data structures, selectable depending 
on the actual value of the range density |R_{i,j,k}| to give the most efficient table 
representation, is chosen for building the forwarding table in this fourth level, 
if |R_{i,j,k}| is less than a predetermined third limit value 12, otherwise a SVN 
representation of size 2^4 pointers is used. 
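
The per-level selection recited above can be sketched as a threshold comparison. This is a minimal illustration, assuming the limit values 2313, 137 and 12 and the SVN sizes 2^8, 2^4 and 2^4 for the second, third and fourth levels; the names `choose_layer`, `svn_pointers` and `enum layer_kind` are hypothetical, and the choice among the individual tree structures within a set is not modelled.

```c
#include <assert.h>

enum layer_kind { LAYER_TREE, LAYER_SVN };

/* Limit values and SVN index widths for levels 2, 3 and 4. */
static const unsigned limit_value[3] = { 2313, 137, 12 };
static const unsigned svn_bits[3]    = { 8, 4, 4 };

/* level is 2, 3 or 4; density is the number of ranges in the sub-universe
 * represented at that level. A tree structure is used below the limit,
 * otherwise a Static Vector Node. */
static enum layer_kind choose_layer(int level, unsigned density)
{
    assert(level >= 2 && level <= 4);
    return density < limit_value[level - 2] ? LAYER_TREE : LAYER_SVN;
}

/* Number of pointers in the SVN used at the given level. */
static unsigned svn_pointers(int level)
{
    assert(level >= 2 && level <= 4);
    return 1u << svn_bits[level - 2];
}
```

For example, a second-level sub-universe with 2312 ranges would be built as a tree, whereas one with 2313 ranges would be built as an SVN of 256 pointers.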

10. The method according to claim 9, characterized in that an initial SVN lookup 
by direct indexing is performed in said first level to retrieve a pointer that 
either represents a next-hop index, which means that the lookup is completed, 
or refers to a sub-data structure where the lookup is continued on said second 
level and, depending on the value of |R_i|, encoded in the pointer value and 
obtained by decoding the pointer value, a lookup is completed in a forwarding 
table represented by one of said first set of said predetermined layered data 
structures if |R_i| is less than said predetermined first limit value 2313, otherwise 
a SVN lookup by direct indexing is performed in said second level to retrieve a 
pointer that either represents a next-hop index, which means that the lookup 
is completed, or refers to a sub-data structure where the lookup is continued 
on said third level and, depending on the value of |R_{i,j}|, encoded in the pointer 
value and obtained by decoding the pointer value, a lookup is completed in 
a forwarding table represented by one of said second set of said predetermined 
layered data structures if |R_{i,j}| is less than said predetermined second limit value 
137, otherwise a SVN lookup by direct indexing is performed in said third level 
to retrieve a pointer that either represents a next-hop index, which means that 
the lookup is completed, or refers to a sub-data structure where the lookup is 
continued on said fourth level and, depending on the value of |R_{i,j,k}|, encoded 
in the pointer value and obtained by decoding the pointer value, a lookup is 
completed in a forwarding table represented by one of said third set of said 
predetermined layered data structures if |R_{i,j,k}| is less than said predetermined 
third limit value 12, otherwise the lookup is completed by a final SVN lookup 
using direct indexing in said fourth level. 
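
The pointer decoding step can be sketched as follows. The bit layout is an assumption made only for illustration, since the claims state that the range density is encoded in the pointer value but not how; the names `ptr_leaf`, `ptr_sub`, `ptr_step`, `ptr_payload` and `svn_index_level1` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit pointer layout (not specified in the text):
 *   bit 31      : 1 = holds a next-hop index, 0 = refers to a sub-structure
 *   bits 30..19 : range density of the sub-universe, saturated at 4095
 *   bits 18..0  : next-hop index, or offset of the sub-structure */
#define PTR_LEAF_FLAG     0x80000000u
#define PTR_DENSITY_SHIFT 19
#define PTR_DENSITY_MAX   4095u
#define PTR_LOW_MASK      0x0007FFFFu

enum next_step { STEP_DONE, STEP_TREE, STEP_SVN };

/* Pointer that completes the lookup with a next-hop index. */
static uint32_t ptr_leaf(uint32_t hop)
{
    return PTR_LEAF_FLAG | (hop & PTR_LOW_MASK);
}

/* Pointer that refers to a sub-structure and carries its range density. */
static uint32_t ptr_sub(uint32_t density, uint32_t offset)
{
    if (density > PTR_DENSITY_MAX)
        density = PTR_DENSITY_MAX;
    return (density << PTR_DENSITY_SHIFT) | (offset & PTR_LOW_MASK);
}

/* Decide how the lookup continues: done, tree lookup, or another SVN. */
static enum next_step ptr_step(uint32_t p, uint32_t limit)
{
    if (p & PTR_LEAF_FLAG)
        return STEP_DONE;
    uint32_t density = (p >> PTR_DENSITY_SHIFT) & PTR_DENSITY_MAX;
    return density < limit ? STEP_TREE : STEP_SVN;
}

/* Next-hop index or sub-structure offset carried by the pointer. */
static uint32_t ptr_payload(uint32_t p)
{
    return p & PTR_LOW_MASK;
}

/* Direct index into the first-level SVN: the top 16 address bits. */
static uint32_t svn_index_level1(uint32_t addr)
{
    return addr >> 16;
}
```

Under this assumed layout, one memory access to the first-level vector yields everything needed to choose the next step, so no separate descriptor has to be read before continuing in the second level.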

11. The method according to claims 9 or 10, characterized in that in said second 
level the universe U is partitioned into 2^16 sub-universes U_0, U_1, ..., U_{2^16-1}, 
each of size 2^16, and in that the set of ranges R is processed into R_0, R_1, ..., R_{2^16-1} 
and represented by the sub-data structures T_0, T_1, ..., T_{2^16-1}. 

12. The method according to claim 11, characterized in that in said third level each 
sub-universe U_i is partitioned into 2^8 sub-universes U_{i,0}, U_{i,1}, ..., U_{i,2^8-1}, each 
of size 2^8, and in that the set of ranges R_i is processed into R_{i,0}, R_{i,1}, ..., R_{i,2^8-1} 
and represented by the sub-data structures T_{i,0}, T_{i,1}, ..., T_{i,2^8-1}. 

13. The method according to claim 12, characterized in that in said fourth level 
each sub-universe U_{i,j} is partitioned into 2^4 sub-universes U_{i,j,0}, U_{i,j,1}, ..., U_{i,j,2^4-1}, 
each of size 2^4, and in that the set of ranges R_{i,j} is processed into 
R_{i,j,0}, R_{i,j,1}, ..., R_{i,j,2^4-1} 
and represented by the sub-data structures T_{i,j,0}, T_{i,j,1}, ..., T_{i,j,2^4-1}. 
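
The 16 + 8 + 4 + 4 bit partitioning of a 32-bit address can be sketched with shifts and masks. The function names below are hypothetical; the bit widths follow the sub-universe sizes recited in claims 11 to 13.

```c
#include <assert.h>
#include <stdint.h>

/* Index i of the sub-universe U_i: the top 16 address bits. */
static uint32_t idx_i(uint32_t a) { return a >> 16; }

/* Index j of U_{i,j} within U_i: the next 8 bits. */
static uint32_t idx_j(uint32_t a) { return (a >> 8) & 0xFFu; }

/* Index k of U_{i,j,k} within U_{i,j}: the next 4 bits. */
static uint32_t idx_k(uint32_t a) { return (a >> 4) & 0xFu; }

/* Position within U_{i,j,k}, which has size 2^4: the last 4 bits. */
static uint32_t idx_rest(uint32_t a) { return a & 0xFu; }
```

For the address 192.168.1.43 (0xC0A8012B) this gives i = 0xC0A8, j = 0x01, k = 0x2 and a position of 0xB within the innermost sub-universe.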

14. The method according to any of the claims 9 - 13, characterized in that said 
first set of data structures comprises Dynamic Flat Tree and Dynamic Layered 
Tree structures. 

15. The method according to any of the claims 9 - 14, characterized in that said 
second set of data structures comprises Dynamic Layered Tree structures. 

16. The method according to any of the claims 9 - 15, characterized in that said 
third set of data structures to be used in said fourth level of the lookup procedure 
comprises Dynamic Layered Tree and Static Flat Tree structures. 

17. The method according to any of claims 1 - 2 or 4 - 6, characterized in that 
the maximum number of routes is fixed to 2187 and the maximum allowable 
number of memory accesses to 4, whereupon a 32-bit variation of the data 
structure Dynamic Layered Tree is used as construction block for the building 
of the forwarding table such that the memory needs equal 26240 bytes. 
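
The figure of 26240 bytes can be reproduced from the layout of the 32-bit DLT listing in the description: cache lines of 8 words, nodes with 9 children, and three node levels above the leaves. The sketch below is an arithmetic check only; the function names are hypothetical, and the count of 6 ranges per leaf is inferred from the 5-key leaf search rather than stated in the claim.

```c
#include <assert.h>

#define WORDS_PER_LINE 8   /* one cache line holds 8 32-bit words */
#define FANOUT         9   /* 8 keys per node give 9 children */

/* Words used by a subtree with the given number of node levels above
 * the leaf level; a leaf is a single cache line. */
static unsigned dlt32_words(int node_levels)
{
    unsigned words = WORDS_PER_LINE;
    for (int i = 0; i < node_levels; i++)
        words = WORDS_PER_LINE + FANOUT * words;
    return words;
}

/* Total size in bytes of the four-level table (three node levels). */
static unsigned dlt32_bytes(void)
{
    return 4u * dlt32_words(3);
}

/* Ranges representable: 9^3 leaves, each distinguishing 6 ranges. */
static unsigned dlt32_ranges(void)
{
    return 9u * 9u * 9u * 6u;
}
```

The recurrence gives 8, 80, 728 and 6560 words, that is 26240 bytes, and the 4374 representable ranges equal twice the 2187 routes named in the claim.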

18. A system for IP routing lookup to determine where to forward an IP-datagram 
with a given destination address by retrieving from a routing table a next-hop 
index indicating where to forward said datagram, said next-hop index being 
associated with the longest matching prefix of said destination address, said 
address being a number in an address universe U, said system comprising a first 
converting means for converting a set of address prefixes P into a set of ranges 
R constituting a partition of said universe U and a second converting means for 
converting the mapping from P onto a set of next-hop indices D to an equivalent 
mapping from R onto D, characterized by data structuring means for 
forming predetermined layered data structures T representing the routing table, 
and building and memory means for building and storing a forwarding table 
representation from R and D by using a predetermined layered data structure 
where the construction of the layer is selected depending on the range density 
|R'| for the sub-universe U' represented by that layer to get a space efficient 
representation of the set of ranges R, and means for performing the lookup by 
a range matching operation in said forwarding table. 

19. The system according to claim 18, characterized in that said forwarding table 
building and storing means are arranged to repeatedly reduce the size of the 
address sub-universes in successive levels, wherein a Static Vector Node (SVN) 
representation is used in a first level for the forwarding table representation, one of a 
first set of said predetermined layered data structures, selectable depending on 
the actual value of the range density |R_i| to give the most efficient table 
representation, is chosen for building the forwarding table in a second level, if |R_i| 
is less than a predetermined first limit value, otherwise a SVN representation is 
used, in a third level one of a second set of said predetermined data structures, 
selectable depending on the actual value of the range density |R_{i,j}| to give the 
most efficient table representation, is chosen for building the forwarding table in 
this third level, if |R_{i,j}| is less than a predetermined second limit value, otherwise 
a SVN representation is used, etc., until the building and storing is completed. 

20. The system according to any of the claims 18 or 19, characterized in that 
said lookup performing means are arranged to perform an initial SVN lookup 
by direct indexing in said first level to retrieve a pointer that either represents 
a next-hop index, which means that the lookup is completed, or refers to a 
sub-data structure where the lookup is continued on said second level and, depending 
on the value of |R_i|, encoded in the pointer value and obtained by decoding the 
pointer value, a lookup is completed in a forwarding table represented by one 
of said first set of said predetermined layered data structures if |R_i| is less than 
said predetermined first limit value, otherwise a SVN lookup by direct indexing 
is performed in said second level to retrieve a pointer that either represents a 
next-hop index, which means that the lookup is completed, or refers to a 
sub-data structure where the lookup is continued on said third level and, depending 
on the value of |R_{i,j}|, encoded in the pointer value and obtained by decoding the 
pointer value, a lookup is completed in a forwarding table represented by one 
of said second set of said predetermined layered data structures if |R_{i,j}| is less 
than said predetermined second limit value, otherwise a SVN lookup by direct 
indexing is performed, etc., until the lookup is completed. 

21. The system according to any of the claims 18 - 20, characterized in that 
the maximum number of routes is fixed to 2^18 routes and the maximum number 
of memory accesses for looking up the next-hop index is 4, whereupon 
said data structures are selected as construction blocks for the building of the 
forwarding table in such a way that memory needs are less than 2.7 Mbytes. 

22. The system according to any of the claims 18 - 21, characterized in that 
said forwarding table building and storing means are arranged to repeatedly 
reduce the size of the address sub-universes in successive levels, wherein a Static Vector 
Node (SVN) representation of size 2^16 pointers is used in a first level for the 
forwarding table representation, one of a first set of said predetermined layered 
data structures, selectable depending on the actual value of the range density 
|R_i| to give the most efficient table representation, is chosen for building the 
forwarding table in a second level, if |R_i| is less than a predetermined first limit 
value 2313, otherwise a SVN representation of size 2^8 pointers is used, in a 
third level one of a second set of said predetermined data structures, selectable 
depending on the actual value of the range density |R_{i,j}| to give the most efficient 
table representation, is chosen for building the forwarding table in this third 
level, if |R_{i,j}| is less than a predetermined second limit value 137, otherwise a 
SVN representation of size 2^4 pointers is used, and in a fourth level one of a third 
set of said predetermined data structures, selectable depending on the actual 
value of the range density |R_{i,j,k}| to give the most efficient table representation, 
is chosen for building the forwarding table in this fourth level, if |R_{i,j,k}| is less 
than a predetermined third limit value 12, otherwise a SVN representation of 
size 2^4 pointers is used. 

23. The system according to any of the claims 18 - 22, characterized in that said 
lookup performing means are arranged to perform an initial SVN lookup by 
direct indexing in said first level to retrieve a pointer that either represents a 
next-hop index, which means that the lookup is completed, or refers to a 
sub-data structure where the lookup is continued on said second level and, depending 
on the value of |R_i|, encoded in the pointer value and obtained by decoding the 
pointer value, a lookup is completed in a forwarding table represented by one of 
said first set of said predetermined layered data structures if |R_i| is less than said 
predetermined first limit value 2313, otherwise a SVN lookup by direct indexing 
is performed in said second level to retrieve a pointer that either represents a 
next-hop index, which means that the lookup is completed, or refers to a 
sub-data structure where the lookup is continued on said third level and, depending 
on the value of |R_{i,j}|, encoded in the pointer value and obtained by decoding 
the pointer value, a lookup is completed in a forwarding table represented by 
one of said second set of said predetermined layered data structures if |R_{i,j}| is 
less than said predetermined second limit value 137, otherwise a SVN lookup 
by direct indexing is performed in said third level to retrieve a pointer that 
either represents a next-hop index, which means that the lookup is completed, 
or refers to a sub-data structure where the lookup is continued on said fourth 
level and, depending on the value of |R_{i,j,k}|, encoded in the pointer value and 
obtained by decoding the pointer value, a lookup is completed in a forwarding 
table represented by one of said third set of said predetermined layered data 
structures if |R_{i,j,k}| is less than said predetermined third limit value 12, otherwise 
the lookup is completed by a final SVN lookup using direct indexing in said 
fourth level. 

24. The system according to any of the claims 21 - 23, characterized in that said 
data structuring means are adapted to partition, in a first level, the universe U 
into 2^16 sub-universes 

U_0, U_1, ..., U_{2^16-1}, 

each of size 2^16, and to process the set of ranges R into 

R_0, R_1, ..., R_{2^16-1}, 

represented by the sub-data structures T_0, T_1, ..., T_{2^16-1}. 

25. The system according to claim 24, characterized in that said data structuring 
means are adapted to partition, in a second level, each sub-universe U_i into 
2^8 sub-universes 

U_{i,0}, U_{i,1}, ..., U_{i,2^8-1}, 

each of size 2^8, and to process the set of ranges R_i into 

R_{i,0}, R_{i,1}, ..., R_{i,2^8-1}, 

represented by the sub-data structures T_{i,0}, T_{i,1}, ..., T_{i,2^8-1}. 

26. The system according to claim 25, characterized in that said data structuring 
means are adapted to partition, in a third level, each sub-universe U_{i,j} into 2^4 
sub-universes 

U_{i,j,0}, U_{i,j,1}, ..., U_{i,j,2^4-1}, 

each of size 2^4, and to process the set of ranges R_{i,j} into 

R_{i,j,0}, R_{i,j,1}, ..., R_{i,j,2^4-1}, 

represented by the sub-data structures T_{i,j,0}, T_{i,j,1}, ..., T_{i,j,2^4-1}. 

27. The system according to any of the claims 18 - 26, characterized in that said 
data structuring means are adapted to form data structures comprising Dynamic 
Flat Tree, Dynamic Layered Tree and Static Flat Tree data structures. 

28. The system according to any of the claims 18 - 20, characterized in that said 
data structuring means are adapted to form a 32-bit variation of a Dynamic 
Layered Tree data structure as construction block for building of a forwarding 
table such that memory needs equal 26240 bytes for a maximum number of 
routes of 2187 and a maximum number of memory accesses of 4. 

29. A computer program product, characterized by computer program code means 
to make a computer execute the method according to any of the claims 1 - 17 
when the program is run on a computer. 